GitHub

Ali Baba released Qwen 3.7-Plus this week, the latest AI mega-linguistic model in its globally popular and expanding Qwen series.LLMIt has more multimodular features and costs 60% less than the Qwen3.7-Max model, which was released a few weeks ago. However, as with its predecessor, Qwen 3.7-Plus can only be programmed via proprietary application interfaces (API) and Qwen Chat is used under a “closed” commercial licence. This represents a significant departure from the Qwen strategy to date, which focuses mainly on the release of powerful, near the most advanced open-source models. Those businesses and users that rely on open-source Qwen models — including giants such as Airbnb — will undoubtedly be disappointed that Ali Baba has closed his new version. Nevertheless, the model is worth looking at because it has low-cost and high performance in multi-modal tasks, such as creating enterprise-level visual effects or analysing videos, images and screenshots, which Qwen 3.7-Max is unable to do (it is only text). It is currently one of the cheapest powerful artificial intelligence models, at a price slightly higher than the time-limited discount for China’s new competitor MiniMax-M3. Venture Beat Frontier AI model API pricing snapshot

The model is entered, the total cost of the output is sourced,

MiMo-V2.5 flashback 0.10 US$ 0.30 US$ 0.40 US$

Depth search -v4-flash | 0.14 US$ 0.28 US$ 0.42 US$ |

Depth search -v4-pro | 0.435 US$ 0.87 US$ 1.305 US$ |

Mini Max-M3 | 0.30 US$ | 1.20 US$ | 1.50 US$ |

Qwen3.7-Plus | 0.40 US$ | 1.60 US$ | 2.00 US$ |

Gemini. 3.1 Flash-Lite | 0.25 USD | 1.50 USD | 1.75 USD |

MiMo-V2.5 | 0.40 US$ 2.00 US$ 2.40 US$ |

Grok 4.3 Low context | $1.25 | $2.50 | 3.75 |

GLM-5 | 1.00 US$ 3.20 US$ 4.20 US$ |

Kimi-K2.6 | $ 0.95 | $ 4.00 | $4.95 |

GLM-51 | 1.40 US$ 4.40 US$ 5.80 US$ |

Grok 4.3 High Context | 2.50 US$ 5.00 US$ 7.50 US$ |

Qwen3.7-Max | 2.50 US$ 7.50 US$ 10.00 US$ |

Duplex 3.5 flash | 1.50 US$ 9.00 US$ 10.50 US$

Gemini 3.1 Pro preview 200K | 2.00 US$ 12.00 US$ 14.00 US$

GPT-54 US$ 2.50 US$ 15.00 US$ 17.50 US$

Gemini 3.1 Pro Preview > 200K | 4.00 US$ 18.00 US$ 22.00 US$

Claude. Opus 4.8 | 5.00 | 25.00 | 30.00 |

GPT-5.5 | 5.00 US$ 30.00 US$ 35.00 US$

Maintain continuity in complex tool implementation cycles

For technical decision makers who deploy autonomous agents, the main bottleneck is rarely the initial model intelligence. Rather, it is a decline in status — a tendency for the proxy framework to lose its analytical trajectory in multi-step, long-term missions. Qwen3.7-Plus addresses this structural gap through a combination of context management and reasoning. The model is accompanied by 1 million token context windows and assigns up to 256 k tokens exclusively for internal thought chain processing. To place this capacity in context, imagine an automated cloud migration agent: It can ingestion the entire code library, map the dependency relationship and spend thousands of tokens quietly assessing the edges before executing a line of bash scripts. Most importantly, the API released a parameter called "preserve thinking"

...'In the Aribaba ecosystem, this function serves as a standard framework bridge rather than a layer of welfare. Ali Baba introduced this feature in the previous generation of Qwen 3.6 and integrated it into an open-weight Qwen 3.6-27B and a proprietary Max model. The core is that the parameters run at the API and template levels to keep the inside Rotation across continuous dialogue. This structural continuity addresses key bottlenecks in the design of long-term tasks by developers. By maintaining the integrity of these internal logical cycles, this function prevents the model from recalculating its cache history in the course of its operation by discarding its context or unnecessarily. When the model performs complex multi-step proxy coding tasks, the reservation allows the system to retain its original idea without losing the context or forgetting the basic logic of its previous operation. Ali Baba is not the only company that recognizes the need for this technology, as this basic concept now determines the structure of almost all major artificial intelligence laboratories. Anthropic This feature, called “extended thinking”, was deployed in its advanced model, including the latest Claude Opus 4.8. The framework requires developers to provide direct feedback back to API on unmodified thinking in the subsequent round to maintain an uninterrupted chain of reasoning. OpenAI The same challenge was resolved through a mechanism of encrypted reasoning echoing models such as GPT-5.5. In the OpenAI ecosystem, developers must return to the specific reasoning that was generated with the previous function to ensure that the model clearly remembers the rationale behind the implementation of its tool. Finally, keep thinking.

It merely represents the term Ali Baba, which has quickly become an indisputable bet of modern multi-wheel reasoning. The benchmark test showed a model that was competitive but not yet at the most advanced level

In terms of seed capacity indicators, this structure of in-depth thinking translates into structural gains across multiple models and proxy benchmarks. However, it is still below many leading and former American proprietary models, such as Claude Opus 4.6 in Anthropic and GPT-5.4 in OpenAI. In Terminal Bench 2.0-Terminus (measurement of the ability of the model to operate the actual end-level code in a safe and iterative manner), Qwen 3.7-Plus was divided into 70.3, better than DeepSeek-V4-Pro Max (67.9) and Gemini-3.1 Pro (63.5). In computer visual benchmarking tests that need to be understood in local interfaces, such as ScreenSpot Pro, the model has reached 79.0 and significantly exceeds those in traditional industries such as GPT-5.4 (xheigh) 67.4 and Claude-Opus-4.6 49.5. proxy assessment indicator (selected baseline)

What purpose should enterprises consider for Qwen 3.7-Plus? For corporate architects, the key question in analysing Qwen 3.7-Plus is clear: What is it replacing in our current technology stack? The model is intended to directly replace the high frequency developer workflows, robotic process automation (RPA) and the main front-line models in the data engineering pipeline (e.g. GPT-5 or Claude-Max). The technical team can route these tasks to Qwen 3.7-Plus instead of deploying expensive generic flagship models to handle duplicate system operations. It also handles visual interface interpretation, command execution and code generation. Ali Baba has built its API delivery to align with existing open-source and proprietary business frameworks. These endpoints are fully compatible with OpenAI, which means that the minimum infrastructure adjustments are required to replace the existing dependencies. For groups using autonomous terminal frameworks, integration across multiple environments is home-grown. Engineers can directly run Qwen 3.7-Plus by changing basic environmental targets through local terminal settings. From a purely cost point of view, it would soon become costly to run an agent framework that constantly quotes a large number of code repositories or visualizes the layout history. Ali Baba solves this problem through the disclosure of fine cache prices. The cost of standard input processing is $0.40 per million of tokens, but if the agent reads it from a cache created in a visible manner (e.g. a large base repository or a standard UI toolkit that maintains static in hundreds of automated cycles), the cost of subsequent reading will drop sharply to $0.04 per million token. This layer makes HF, multiple-wheel agent, economically viable in terms of enterprise size. Lack of open-source licences or open-source weights raises compliance issues for enterprises

In assessing any model in the Qwen ecosystem, the legal and security team is primarily concerned with the licensing framework and operational boundaries of the data pipeline. Although the Qwen series was preceded by an iterative generation that gained significant business attractiveness through full open-source weight availability or customised open-use licences under Apache 2.0, Qwen 3.7-Plus was provided through Aliyun Mode Studio as a strictly trusted commercial cloud API. This distinction has specific implications for enterprise risk management:

No local weight deployment: the organization cannot download, sandbox or local hosting Qwen 3.7-Plus weights in a fully isolated internal data centre. All data validation, visualization processing and call-up must go through Aliyun ' s international endpoint (e.g. the case of Singapore highlighted in the developers ' files). Compliance and sovereignty: As the model requires cloud-based reasoning, companies operating under strict sovereign data boundaries (e.g. health-care entities or defence contractors subject to local HIPAA/GDPR) must clearly assess whether external API routes are consistent with their specific data presence obligations. Host risk mitigation: instead, the hosting API structure eliminates the internal infrastructure burden of configuration, optimization and maintenance of multiple GPU clusters (e.g. dedicated Nvidia H100 arrays) for hosting the internal proxy network. Despite this, Qwen 3.7-Plus provides high-mode intelligence at low cost.

The initial reactions of the developers ' community and technology risk investments highlighted the economic changes in proxy deployment. @Boxmining highlights the strategic cost advantage:

"Qwen 3.7 Plus is 40% cheaper than Max, which changes people's perception. If the output is close enough for most of the codes and is more powerful for visual workflows, do you really need Max every day, or just for heavy terminal work?"

This view is in line with the current trend towards optimizing the enterprise ' s operating budget: moving from raw, unfettered calculations to mission-specific automation. At the same time, professional researchers who go deep into ecosystems point out that this is not just an incremental optimization of text generation. The research intern in Ali Baba Qwen, Lu Liangjie, said:

“It has demonstrated a significant improvement in computer use functionality compared to Qwen3.6-Plus, which is more generic and extends beyond general desktop tasks to professional workflows such as data engineering and scientific research.”

Ultimately, Qwen 3.7-Plus offers a practical alternative to business buyers who decide on the next infrastructure road map. If the main objective of your organization is to build an autonomous software cycle that is flexible and visual and interacts directly with the developer ' s environment and cloud console, and does not exceed your reasoning budget, the model provides a compelling reason to shift implementation from a more expensive front-line alternative.

![Qwen3.7-Plus of Ari Baba supports text, video and image input at a cost of 0.4/1.6 per million DF - it is proprietary](https://images.ctfassets.net/jdtwqhzvc2n1/3PD7bcZwfl7LUqX2wWEFux/05c4df4be513499b49ea1d9e79009cc7/Gemini_Generated_Image_tmex3utmex3utmex.png?w=800&q=75)