DeepSeek opens 3FS — file system code for AI training servers
During Open Source Week, DeepSeek made its Fire-Flyer File System (3FS), a parallel file system, publicly available. According to the company, 3FS reaches an aggregate read throughput of 7.3 TB/s on its own server clusters, where it has been in use since 2019.
ChatGPT «thinks» in Chinese: users noticed Chinese characters in the model's reasoning
ChatGPT running the o1 language model spends “more time thinking” to produce a better response. Users have found that this “thinking” sometimes occurs in Chinese, regardless of the query language.
A YouTuber has created a cluster of five Apple Mac mini M4s — how effective is it?
The video blogger demonstrated a computing cluster built from new Apple Mac minis with the M4 processor. In some cases it performs better than a powerful graphics card.
Gemini 1.5 Flash: a fast multimodal Google model with a context window of 2 million tokens
Google has announced the release of Gemini 1.5 Flash, a small multimodal model designed to scale and to handle narrow, high-frequency tasks.
Meet MAI-1: Microsoft’s new 500 billion-parameter AI model that aims to «beat» GPT
Microsoft appears to be working on its own large language model, which could become a major competitor to the AI models of Google, Anthropic, and OpenAI, even though the corporation itself invested $10 billion in the developer of ChatGPT and received priority rights to use its products.
AI large language models (LLMs) become «more covertly racist» after human intervention
From the very beginning, it was clear that large language models (LLMs) like ChatGPT absorb racist messages from the millions of Internet pages they are trained on. Developers have responded by trying to make the models less toxic. But new research shows that these efforts, especially as the models get bigger, only serve to stifle racist…