What do you think about the new AI chatbot DeepSeek?

Cheap, powerful Chinese AI for all: DeepSeek. The AI startup has upended the industry by developing a model that costs much less to produce – and is available free to a universe of tinkerers.
The release of China's new DeepSeek AI-powered chatbot app has rocked the technology industry. It quickly overtook OpenAI's ChatGPT as the most-downloaded free iOS app in the US, and caused chip-maker Nvidia to lose almost $600bn (£483bn) of its market value in one day – a new US stock market record.

The reason behind this tumult? The "large language model" (LLM) that powers the app has reasoning capabilities comparable to US models such as OpenAI's o1, but reportedly costs a fraction as much to train and run. DeepSeek claims to have achieved this by deploying several technical strategies that reduced both the amount of computation time required to train its model (called R1) and the amount of memory needed to store it. The reduction of these overheads resulted in a dramatic cut in cost, says DeepSeek. R1's base model, V3, reportedly required 2.788 million GPU-hours to train (running across many graphics processing units – GPUs – at the same time), at an estimated cost of under $6m (£4.8m), compared with the more than $100m (£80m) that OpenAI reportedly spent training GPT-4.

Despite the hit taken to Nvidia's market value, the DeepSeek models were trained on around 2,000 Nvidia H800 GPUs, according to a research paper released by the company. These chips are a modified version of the widely used H100 chip, built to comply with export rules on sales to China. They were likely stockpiled before restrictions were further tightened by the Biden administration in October 2023, which effectively banned Nvidia from exporting the H800s to China. It is likely that, working within these constraints, DeepSeek has been forced to find innovative ways to make the most effective use of the resources at its disposal.
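The headline figures above are back-of-the-envelope arithmetic. A minimal sketch, assuming a nominal rental price of about $2 per H800 GPU-hour – an assumed rate for illustration, not a measured cost – shows how the reported numbers fit together:

```python
# Rough check on DeepSeek's reported training cost.
# Figures from the article: 2.788 million GPU-hours on ~2,000 H800 GPUs.
# The $2/GPU-hour rate is an illustrative assumption, not a disclosed price.

gpu_hours = 2.788e6          # total H800 GPU-hours reported for training V3
rate_per_gpu_hour = 2.0      # assumed rental price in USD

total_cost = gpu_hours * rate_per_gpu_hour
print(f"Estimated training cost: ${total_cost / 1e6:.2f}m")   # ≈ $5.58m

# Wall-clock time if a 2,048-GPU cluster ran the job in parallel:
num_gpus = 2048
days = gpu_hours / num_gpus / 24
print(f"Roughly {days:.0f} days on {num_gpus} GPUs")          # ≈ 57 days
```

At that assumed rate the total lands just under the $6m the company cites, which is where the "fraction of the cost" comparison with US models comes from.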
Reducing the computational cost of training and running models may also address concerns about the environmental impact of AI. The data centres they run on have huge electricity and water demands, largely to keep the servers from overheating. While most technology companies do not disclose the carbon footprint involved in operating their models, a recent estimate puts ChatGPT's carbon dioxide emissions at more than 260 tonnes per month – the equivalent of 260 flights from London to New York. Increasing the efficiency of AI models would therefore be a positive direction for the industry from an environmental point of view.

Historical resonances were rife. When Marc Andreessen called DeepSeek's release "AI's Sputnik moment", he was referring to the seminal moment in 1957 when the Soviet Union launched the first Earth satellite, thereby displaying technological superiority over the US – a shock that triggered the creation of Nasa and, ultimately, the internet. Other people were reminded of the advent of the "personal computer" and the ridicule heaped upon it by the then giants of the computing world, led by IBM and other purveyors of huge mainframe computers.

Suddenly, people are beginning to wonder if DeepSeek and its offspring will do to the trillion-dollar AI behemoths of Google, Microsoft, OpenAI et al what the PC did to IBM and its ilk. And of course there are the conspiracy theorists wondering whether DeepSeek is really just a disruptive stunt dreamed up by Xi Jinping to unhinge the US tech industry. Is the model really that cheap to train? Can we believe the numbers in the technical reports published by its makers? And so on.

Standing back, there are four things to take away from the arrival of DeepSeek. The first is that China has caught up with the leading US AI labs, despite the widespread (and hubristic) western assumption that the Chinese are not as good at software as we are.
Even a cursory examination of some of the technical details of R1 and the V3 model that lay behind it evinces formidable technical ingenuity and creativity.

Second, the low training and inference costs of R1 will turbocharge American anxiety that the emergence of powerful – and cheap – Chinese AI could upend the economics of the industry, much as the advent of the PC transformed the computing marketplace in the 1980s and 90s. What the advent of DeepSeek indicates is that this technology – like all digital technology – will eventually be commoditised. R1 runs on my laptop without any interaction with the cloud, for example, and soon models like it will run on our phones.

Third, DeepSeek pulled this off despite the ferocious technology bans imposed by the first Trump administration and then by Biden's. The company's technical report shows that it possesses a cluster of 2,048 Nvidia H800 GPUs – technology officially banned by the US government for sale to China.

And last, but by no means least, R1 seems to be a genuinely open-source model. It's distributed under the permissive MIT licence, which allows anyone to use, modify and commercialise the model without restrictions. As I write this, my hunch is that geeks across the world are already tinkering with, and adapting, R1 for their own particular needs and purposes, in the process creating applications that even the makers of the model couldn't have envisaged. It goes without saying that this has its upsides and downsides, but it's happening. The AI genie is now really out of the bottle.
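The claim above that R1-class models run on a laptop comes down to memory arithmetic. A minimal sketch, assuming a distilled 7-billion-parameter variant quantised to 4 bits per weight – both figures are illustrative assumptions; the full R1 is far larger and needs data-centre hardware:

```python
# Why a distilled model fits on a laptop: a rough memory estimate.
# The 7B parameter count and 4-bit quantisation are illustrative
# assumptions; the full-size R1 model would not fit in laptop RAM.

params = 7e9                 # parameters in an assumed distilled variant
bits_per_weight = 4          # a common quantisation level for local inference

weight_memory_gb = params * bits_per_weight / 8 / 1e9
print(f"Approximate weight memory: {weight_memory_gb:.1f} GB")   # ≈ 3.5 GB

# At 16-bit precision the same model needs four times as much:
print(f"At 16-bit precision: {params * 16 / 8 / 1e9:.1f} GB")    # ≈ 14.0 GB
```

A few gigabytes of weights sits comfortably in the RAM of an ordinary laptop, which is what makes the cloud-free tinkering the column describes feasible.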
