
DeepSeek’s R1 large language model collects a huge amount of user data and sends it to China. The AI also distorts information on topics sensitive to the Chinese authorities.
Data collection
The Chinese company saves keystrokes, passwords, and all data entered in queries, including text and images, and then stores them on servers in China.
DeepSeek’s terms permit it to collect personal information, including date of birth, email address, phone number, and password. DeepSeek also allows itself to collect any content that users provide to the R1 LLM. Whenever users contact DeepSeek, they agree to the storage of proof of identity, which presumably means documents such as a passport or driver’s license.
DeepSeek also carefully stores everything related to users’ devices: IP addresses, phone models, language, even «keystroke patterns or rhythms». Cookies also facilitate the collection of user data.
Since R1 is «open source», it can be run anywhere on any hardware, which is generally good for privacy — running the model locally on your own hardware will probably not result in data collection. However, DeepSeek offers online access to R1 through its website and mobile app, which means data is stored and processed.
That said, DeepSeek is very transparent about what data it collects from online users, where it is stored, and what it does with it. All of this is described in detail in the privacy policy, which shows that there is almost nothing the company does not collect.
DeepSeek acknowledges that «advertisers, measurement companies and other partners share information with us about you and the activities you have performed outside of the Service, such as your activities on other websites and apps or in stores, including products or services you have purchased online or in person».
The DeepSeek corporate group also has access to the collected data in order to provide «certain functions, such as storage, content delivery, security, research and development, analytics, customer and technical support, and content moderation». The privacy policy states that all information is stored on servers in China.
Censorship
Also, according to the website Cybernews, the «chatbot spreads pro-Chinese disinformation». The Chinese state can use DeepSeek users’ data: under local law, the Chinese startup must share data with the government if asked.
«As a Chinese company, DeepSeek follows the policies of the Communist Party. This is reflected even in the open-source model, which raises concerns about censorship and other influences», said the researchers behind promptfoo, an open-source tool designed to evaluate large language models.
On Tuesday, promptfoo published a set of queries covering topics likely to be censored by the communist regime. These include issues such as Taiwan’s independence, historical narratives surrounding the bloody Cultural Revolution, and questions about Chinese President Xi Jinping.
The researchers sent 1360 queries to the DeepSeek model, 85% of which the chatbot refused to answer. The refusals tend to be «overly nationalistic in tone and strictly [adhere] to CCP policies». However, the censorship is not very thorough: it can be bypassed with techniques commonly used in such cases, namely some form of query masking.
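To put the reported figures in concrete terms, 85% of 1360 queries comes to roughly 1156 refusals. The short Python sketch below only reproduces that arithmetic. The function name `refusal_rate` and the per-query flags are hypothetical and reconstructed from the two published numbers; they are not promptfoo's actual data or method.

```python
def refusal_rate(refused_flags):
    """Return the share of queries flagged as refused, as a percentage."""
    flags = list(refused_flags)
    if not flags:
        raise ValueError("no results to summarize")
    return 100.0 * sum(flags) / len(flags)


# Figures reported in the article: 1360 queries, about 85% refused.
total_queries = 1360
reported_rate = 0.85
approx_refused = round(total_queries * reported_rate)

# Hypothetical per-query flags rebuilt from those two figures only
# (True = refused); the real per-query results are not given here.
flags = [True] * approx_refused + [False] * (total_queries - approx_refused)

print(approx_refused, refusal_rate(flags))  # prints: 1156 85.0
```

By the same arithmetic, about 204 of the queries received some kind of answer.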