TikTok parent ByteDance releases new open source Seed-OSS-36B model with 512K token context



TikTok is back on the front pages, with the White House joining the popular social media app. But its parent company, the Chinese web giant ByteDance, delivered a surprise of its own.

Today, ByteDance's Seed Team released Seed-OSS-36B on the AI code sharing site Hugging Face.


Seed-OSS-36B is a new line of open source large language models (LLMs) designed for advanced reasoning and developer-focused usability, with a longer token context (that is, how much information the models can accept as input and produce as output in a single exchange) than many competing LLMs from U.S. tech companies, including leaders OpenAI and Anthropic.

The collection comprises three primary variants:




  • Seed-OSS-36B-Base (with synthetic data)
  • Seed-OSS-36B-Base (without synthetic data)
  • Seed-OSS-36B-Instruct

By releasing both synthetic and non-synthetic versions of the Base-36B model, the Seed Team sought to balance practical performance with research flexibility.

The synthetic-data variant, trained with additional instruction data, consistently delivers stronger scores on standard benchmarks and is intended as the higher-performing general-purpose option.

The non-synthetic model, by contrast, omits these augmentations, providing a cleaner foundation that avoids potential bias or distortion introduced by synthetic instruction data.

By offering both, the team gives applied users access to improved results while giving researchers a neutral baseline for post-training experiments.

The Seed-OSS-36B-Instruct model, meanwhile, differs in that it is post-trained on instruction data to prioritize task execution and instruction following, rather than serving purely as a foundation model.

All three models are released under the Apache-2.0 license, allowing free use, modification, and redistribution by researchers and enterprise developers.

That means they can be used to power commercial applications, whether internal to a company or external/customer-facing, without paying any licensing fees or application programming interface (API) usage costs.

The release continues a summer 2025 trend of Chinese companies shipping powerful open source models, with OpenAI attempting to keep pace via its own gpt-oss duo of open source models released earlier this month.

The Seed Team positions Seed-OSS for international applications, emphasizing versatility across reasoning, agent-like task execution, and multilingual settings.

The Seed Team, formed in 2023, has focused on building foundation models that can serve both research and applied use cases.

Design and core features

The architecture behind Seed-OSS-36B combines familiar design choices such as causal language modeling, grouped query attention, SwiGLU activation, RMSNorm, and RoPE positional encoding.
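For readers unfamiliar with the last of those components, the rotation at the heart of RoPE can be sketched in a few lines. This is a generic textbook formulation applied to a single (even, odd) feature pair, not ByteDance's actual code, and the `head_dim` and `base` values are illustrative assumptions:

```python
import math

def rope_rotate(x_even, x_odd, pos, dim_index, head_dim, base=10000.0):
    """Rotate one (even, odd) feature pair by a position-dependent angle,
    the core operation of Rotary Position Embedding (RoPE)."""
    theta = pos / (base ** (2 * dim_index / head_dim))
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (x_even * cos_t - x_odd * sin_t,
            x_even * sin_t + x_odd * cos_t)
```

Because the rotation angle depends only on the token's position, relative offsets between tokens are encoded directly into the attention dot product, which is part of why RoPE-based models handle long contexts well.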

Each model packs 36 billion parameters across 64 layers and supports a vocabulary of 155,000 tokens.

One of its defining features is native long-context capability, with a maximum length of 512,000 tokens, designed to process extended documents and reasoning chains without loss of performance.

That is twice the context length of OpenAI's new GPT-5 model family, and roughly equivalent to 1,600 pages of text, about the length of the Christian Bible.
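The pages figure is easy to reproduce with back-of-envelope arithmetic, assuming roughly 0.75 English words per token and about 240 words per printed page (both common rules of thumb, not Seed Team figures):

```python
tokens = 512_000
words = tokens * 0.75   # ~0.75 words per token (rule of thumb)
pages = words / 240     # ~240 words per printed page (rule of thumb)
print(f"{tokens:,} tokens ~ {words:,.0f} words ~ {pages:,.0f} pages")
```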

Another distinctive element is a "thinking budget," which lets developers specify how much reasoning the model should perform before delivering an answer.

It is a feature we have also seen in other recent open source models, including Nvidia's new Nemotron-Nano-9B-V2, likewise available on Hugging Face.

In practice, it lets teams tune performance according to task complexity and deployment efficiency requirements.

Budgets are recommended in multiples of 512 tokens, with 0 providing a direct-response mode.
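As a sketch of how a client might honor that granularity, the helper below clamps a requested budget to a multiple of 512, with 0 meaning direct-answer mode. The function name and rounding policy are hypothetical illustrations, not part of the actual Seed-OSS API:

```python
def clamp_thinking_budget(requested: int) -> int:
    """Snap a requested reasoning budget to the recommended granularity:
    multiples of 512 tokens, where 0 disables the reasoning trace entirely.
    Hypothetical helper; the real Seed-OSS interface may differ."""
    if requested <= 0:
        return 0  # direct-answer mode, no visible reasoning
    # Round to the nearest multiple of 512, never below one full step.
    return max(512, round(requested / 512) * 512)
```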

Competitive results on third-party benchmarks

Benchmarks published with the Seed-OSS-36B release position it among the stronger large open source models, with the Instruct variant in particular posting state-of-the-art results in several areas.

  • Math and reasoning: Seed-OSS-36B-Instruct achieves 91.7 percent on AIME24 and 65 on BeyondAIME, both representing open source "state of the art" (SOTA).
  • Coding: On LiveCodeBench v6, the Instruct model records 67.4, another SOTA result.
  • Long-context handling: On RULER at a 128,000-token context length, it reaches 94.6, the highest open source result reported.
  • Base model performance: The synthetic-data Base variant delivers 65.1 on MMLU-Pro and 81.7 on MATH, both state-of-the-art results in their categories.

The base version without synthetic data, while slightly behind on many measures, remains competitive.

It even exceeds its synthetic counterpart on GPQA-D, giving researchers a cleaner, instruction-free baseline for experiments.

For enterprises comparing open options, these results suggest Seed-OSS offers strong potential across math, coding, and long-context workloads, while still providing flexibility for research use.

Access and deployment

Beyond performance, the Seed Team highlights accessibility for developers and practitioners. The models can be deployed using Hugging Face Transformers, with quantization support in both 4-bit and 8-bit formats to reduce memory requirements.

They also integrate with vLLM for scalable serving, including configuration examples and API server instructions.
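To see why the 4-bit and 8-bit options matter, a quick estimate of the weight footprint alone is instructive. This back-of-envelope calculation ignores KV cache, activations, and runtime overhead, so it is a lower bound rather than a sizing guide:

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only: params * bits / 8 bytes,
    reported in decimal gigabytes."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

# Seed-OSS-36B weights under common precisions:
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: ~{weight_memory_gb(36, bits):.0f} GB")
```

At 16-bit precision the weights alone need roughly 72 GB, while 4-bit quantization brings that to about 18 GB, close to what a single 24 GB GPU can hold, which is the practical point of the quantized releases.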

To lower barriers further, the team includes scripts for inference, prompt customization, and tool integration.

For technical leads managing small teams or working under budget constraints, these provisions are designed to make experimenting with a 36-billion-parameter model more cost-effective.

Licensing and considerations for enterprise decision makers

With the models offered under Apache-2.0, organizations can adopt them without restrictive licensing terms, an important factor for teams balancing legal and operational concerns.

For decision makers evaluating the open source landscape, the release brings three takeaways:

  • State-of-the-art benchmark results across math, coding, and long-context reasoning.
  • A balance between a higher-scoring synthetically trained model and clean research baselines.
  • Accessibility features that lower operational costs for lean engineering teams.

By pairing strong results and flexible deployment with an open license, ByteDance's Seed Team has added new options for enterprises, researchers, and developers alike.
