大模型创业潮汹涌,撇开热闹的表象,才能看清大模型带来的新机会

2023年科技领域最热的话题就是AI大模型。这股热潮由美国创业公司OpenAI引领,ChatGPT发布后几个月,中国公司密集发布自己的大模型,整个2023年,中国公司发布的大模型数量已经超过130个。
OpenAI能够实现技术突破,和许多科技创新领域公司的特点类似。有足够优秀的人才,海量资金支持,多年持续投入,以及对目标坚定。在ChatGPT发布之前的很长一段时间里,产业界和投资界大多不看好OpenAI,但并未动摇该公司的方向。2023年,几乎所有人都认可了大模型的方向,大家认为,OpenAI已经把结果摆出来了,其他公司要做的就是尽快跟进,不断优化,确保能参与未来。
一些人把过去没有大规模投入大模型的原因归咎于不确定结果。现在已经确定了,算力、数据、人才都可以加大投入,中国公司擅长工程优化,做出能实际应用的大模型产品指日可待。
但事实真的如此吗?对于OpenAI来说,大模型从来都是确定的方向,OpenAI的大部分资金都花在了算力上,当时英伟达的A100(AI专用芯片)价格比今天低很多。据第三方数据机构SemiAnalysis估计,OpenAI使用了约3617台HGX A100服务器,包含近3万块英伟达GPU。光有GPU还不够,投资方微软帮助OpenAI搭建了大模型定制化的算力集群,能够进一步提升这些GPU的效率。在数据方面,OpenAI从数据收集、数据标注、数据清洗、数据整理、数据优化等每个环节都有持续投入。OpenAI团队中大部分人,都来自顶尖的科研机构或科技巨头。
也就是说,在这种实力和投入力度下,OpenAI依然用了超过八年的时间,才打造出突破性产品GPT4,且存在“幻觉”(也就是答非所问、胡说八道等情况)。
为什么中国公司在几个月的时间里,就能做出号称匹敌GPT4的大模型?这是谁的幻觉?
2023年下半年,陆续有部分大模型被指出是“套壳”,直接套用了国外的开源大模型,在一些检验大模型能力的榜单上排名靠前,不少指标都接近GPT4。多位业内人士告诉《财经》记者,榜单表现越好,套壳比例越高,略有调整表现就会变差。
“套壳”只是中国大模型产业现状的冰山一角,这背后折射出产业发展的五个问题,它们之间互为因果,每个问题都无法独立解决。到今天,大模型的大众热度已经明显下滑,2024年,中国大模型产业的问题会进一步暴露。但在热闹、问题之下,大模型已经在产业中发挥价值。
一、模型:原创、拼装还是套壳?
2023年11月,阿里巴巴前技术副总裁、AI科学家贾扬清发文称,某国内大厂做的大模型用的是Meta的开源模型LLaMA,只是修改了几个变量名。贾扬清表示,因为改名导致他们需要做很多工作来适配。
此前,就有国外开发者称,李开复创办的“零一万物”使用的就是LLaMA,只是重命名了两个张量,因此,业内质疑零一万物就是“套壳”。随后,李开复和零一万物均有回应,称在训练过程中沿用了开源架构,出发点是充分测试模型,执行对比实验,这样能快速起步,但其发布的Yi-34B和Yi-6B模型都是从0开始训练,并做了大量原创性优化和突破工作。
2023年12月,媒体报道称,字节跳动秘密研发的大模型项目中,调用了OpenAI的API(应用程序接口),并使用ChatGPT输出的数据进行模型训练。而这是OpenAI的使用协议中明确禁止的行为。
随后,OpenAI暂停了字节的账号,表示会进一步调查,如果属实将要求更改或终止账户。
字节对此的回应是,2023年初,技术团队在大模型探索初期,有部分工程师将GPT的API服务应用于较小模型的实验性项目研究中。该模型仅为测试,没有计划上线,也从未对外使用。在2023年4月公司引入GPT API调用规范检查后,这种做法已经停止。且字节大模型团队已经提出了明确的内部要求,不得将GPT模型生成的数据添加到字节大模型的训练数据集,并培训工程师团队在使用GPT时遵守服务条款。
目前国产大模型中,主要分为三类:一是原创大模型;二是套壳国外的开源大模型;三是拼装大模型,也就是把过去的小模型们拼在一起,变成参数量看起来很大的“大模型”。
其中,原创大模型数量最少,做原创大模型需要有很强的技术积累,且要有持续的高投入,风险很大,因为一旦模型没有足够强的竞争力,这些大规模投入就打了水漂。大模型的价值需要商业化来证明,当市场上已经出现足够好的基础大模型,其他公司应该去挖掘新的价值点,比如大模型在不同领域的应用,或是中间层,比如帮大模型训练、数据处理、算力服务等。
但现状是,大部分参与者都在“卷”所谓的“原创大模型”,又担心风险太高,于是有了大量套壳、拼装的大模型。无论是直接使用开源模型或是拼装模型,只要符合相关规范,都没有问题。到商业化落地阶段,客户也不太会在意是否原创,有用就行,甚至不少客户会因为成本更低,更愿意选择非原创的技术。
问题在于,即使是拼装和套壳,大家也要不断强调“原创”,为了证明“原创”,就需要调整修改,而这又会影响大模型的迭代能力,陷入内耗。
二、算力:卡脖子还是不想买?
大模型的基础之一是海量算力,且是先进算力,因此大模型也被称为暴力美学。英伟达的A100此前被认为是最适合训练大模型的,近期英伟达又推出了更先进的算力芯片H100,但还未在中国市场开售。
一位英伟达的长期合作伙伴告诉《财经》记者,2023年,A100的售价涨了约1倍,据他了解,2023年密集购买A100的中国公司主要是自身有业务需求的大厂,包括阿里巴巴、腾讯、字节跳动、百度等,创业公司很少。有一些知名大模型创业公司会主动要求和他建立战略合作关系,以此来对外证明自己在投入算力,“不给钱的那种”。

2023年密集购买A100的中国公司主要是自身有业务需求的大厂,创业公司很少。图/IC
尽管有美国政府的“出口管制规则”,中国公司想要获得英伟达的算力,并非不可能,目前有很多方式可以选择。除了直接购买,还可以通过英伟达在中国的合作伙伴们购买。GPU本身很贵,买来之后的部署、运营、调试、使用,都是成本。此前业内流传的一句话是,中国不少科研机构连A100的电费都付不起。
由八张A100组成的DGX服务器最大功率是6.5kW,也就是运行一小时需要6.5度电,同时要搭配大约同等电量的散热设备。按照平均工业用电每度0.63元计算,一台服务器开一天(24小时)的电费约200元。如果是1000台服务器,一天的电费就是约20万元。
因此,除了大厂,创业公司很难大规模购买、部署GPU。
GPU资源还可以租用,在阿里云、腾讯云或是亚马逊AWS等云服务平台上,都可以直接租用A100算力服务。租金同样在过去一年涨了不少。
但实际情况是,不少大模型公司并不想在算力上做大规模投入。多位关注AI的投资人告诉《财经》记者,一旦创业公司开始部署算力,会出现两个“问题”,一是这个投入没有上限,没有终点,谁也不知道要烧到什么程度。OpenAI到今天还会因为算力跟不上而出现宕机。二是公司会因此变成重资产公司,这对于公司未来的估值有不利影响,会直接影响到投资人的收益。
2023年,中国不少投资人会直接告诉大模型创业者,先招一些名校背景的人,抓紧开发布会,发布大模型产品,然后做下一轮融资,不要去买算力。
创业公司们在风口期拿到大量融资,高薪招人,高调发布产品,推高估值。一旦风口过去,继续融资或是上市就需要收入,到时候再通过此前融到的钱,去低价甚至亏本竞标项目,或是直接对外投资来并表收入。
这就有可能陷入一个恶性循环:不愿意承担算力高投入的风险,就很难在大模型领域有突破性发展,也就难以和那些真正在这个方向上大规模投入的巨头们竞争。
三、数据:低质数据怎么解决?
数据和算力都是大模型的基础,在数据方面,中国大模型产业面临和算力同样的问题:是否值得大规模投入?
在中国,一般的数据获取门槛很低,过去主要是用爬虫工具来收集数据,现在可以直接用开源的数据集。中国大模型以中文数据为主,业内普遍认为中文互联网数据的质量较低。
一位AI公司创始人形容,当他需要在互联网上搜索专业信息时,他会用谷歌搜索,或是上YouTube。国内的网站或App上,并非缺少专业信息,而是广告内容太多,找到专业内容需要的时间更久。
OpenAI用于训练大模型的中文数据同样来源于中国互联网平台,但它额外做了很多工作来提升数据质量,这不是普通的数据标注工作能完成的,需要专业团队对数据进行清洗、整理。
此前就有AI创业者表示,在中国很难找到相对标准化的数据服务商,大多是定制化服务,定制服务又很贵。
这和是否要大规模投资算力的逻辑有些类似,这笔投入对于很多公司,尤其是创业公司来说,看起来并不划算。如果大规模投入,一旦最后的模型效果不理想,同样是“打水漂”,还不如用开源数据训练,直接开发布会。
此外,中国市场缺乏有效的数据保护手段,一位大厂AI负责人说,“在中国,你能拿到的数据,别人也能拿到”,“如果你花很多钱去做高质量数据,别人可以用很低的成本拿到,反过来也一样。”
包括数据处理在内的大模型中间环节,在2024年会是一个相对明确的新发展方向。无论是哪种模型,在落地到具体应用场景中时,必须要用专业数据做优化调试,这对于数据处理的要求更高,此外还需要有模型调试、工程优化等环节参与。
但如果其中的环节又变成了投资人眼里的“新风口”,那又是另一个故事了。
四、资本:只有资本短视吗?
以上的三个问题,背后都指向一个共同的方向:资本短视。
尽管OpenAI已经蹚出一条明确的道路,对于绝大部分公司来说,想从零开始做出成熟的大模型,需要耗费的成本和时间并不会短很多。
对于大部分投资人来说,每笔投资的目的很明确:退出、赚钱。OpenAI火了,估值一路攀升,未来还会继续增长。2023年4月,该公司估值约280亿美元,到2023年12月,据美国媒体报道,OpenAI最新一轮估值或将超过1000亿美元。这在投资人眼里是一个非常确定的信号,如果以合适的价格投资中国大模型创业公司,也能在很短时间内做到估值成倍增长。
中国投资人的耐心只有三五年,这是资本运作模式决定的。投资人从LP手里募资,需要在一定年限内退出并拿到可观的收益。投资人退出的渠道包括项目并购、上市,或是在后续融资中把自己手里的股份卖给新投资方。
早期融资可以靠风口和讲故事,但走到中后期甚至上市,就必须有一定规模的商业化能力。投资人们发现,拖得越久,项目上市或被并购的难度就越高,因为AI领域主要的商业模式是做B端的定制化项目,这条路径就决定了创业公司很难做出高增长的收入。投资人只能趁风口还在,迅速推动公司完成多轮融资,抬高估值,之后哪怕打折出售手里的股份,也是划算的。
这也是为什么2023年大模型相关的发布会层出不穷,各种大模型榜单百花齐放且排名各不相同,这些都是有助于融资的“故事”。类似的路径在几年前的AI产业已经出现过一次,那个阶段的代表公司是AI四小龙。2023年的大模型创业只是把过去三年走完的路在一年时间里加速完成。
但短视绝不是投资人单方面的问题。在今天的商业环境下,大部分人都追求短期的、确定性的结果,十年,甚至五年后的未来都似乎难以把握。
五、商业化:谁是合适的买单人
2023年,中国大模型产业迅速从比拼大模型参数进入到比拼商业化的阶段。2024年1月的CES(消费电子展)上,两位著名的AI科学家李飞飞和吴恩达均表示,接下来AI商业化会有明显发展,会深入到更多行业。
目前看来,大模型的主要应用方向有两个:一是通过大模型技术为C端用户提供新的工具,比如付费版GPT4、百度用文心大模型重构的百度文库、新的AI视频剪辑工具、文生图工具等。但C端付费短期内很难有大规模增长,对于大模型工具有刚需的人群相对较少。
更有希望的商业化方向是B端服务。在中国市场,做B端软件服务一直是一个“老大难”的生意。多位投资人和业内人士都提到,中国市场最大的B端客户是政府和国企,大模型作为先进的生产力工具,会有一个直接影响是减少人力。而在政府和国企,减少人力在很多时候反而会变成阻力。
如果退而求其次,选择中小B客户,在2024年恐怕也很难。一位AI大模型创业者说,他近期询问了不少企业客户,得到的回应是:“大模型能做什么?能帮我裁员还是能帮我赚钱?”
到今天,即使是最先进的大模型也依然存在“幻觉”问题,这在C端应用上还可以忍受,但在一些专业的B端场景中,有“幻觉”就意味着难以真正落地。过去比对式AI,例如人脸识别,如果识别错误,人工辅助、调整的成本很低,但大模型擅长“一本正经地胡说八道”,具有一定迷惑性。
但大模型已经切实在实际应用了。多位业内人士都提到,因为大模型的出现,很多过去无法解决的问题都有了新方法可以解决,且效率有明显提升。例如前文提到的拼接大模型,在过去很少有人尝试,现在不少AI公司都开始把多个不同场景的小模型拼在一起,在解决大部分同类问题时,不需要再单独训练模型,可以直接调取使用。
此外,在一些有庞大业务的公司里,大模型也已经落地使用。类似于上一轮AI视觉技术带动AI算法的发展,这些AI算法迅速在内容推荐、电商、打车、外卖等领域发挥重要价值。现在,腾讯的游戏业务、阿里的电商业务、字节的内容业务等,都已经用上了大模型。
2024年,AI大模型的发展会有几个相对确定的趋势:一是融资热度下滑,2023年出现的一家公司完成多轮数亿美元融资的情况会明显减少,大模型创业公司需要寻找新的出路。目前看来,大厂们更有实力做大模型基础设施的工作,创业公司可以考虑调整方向,填补基础大模型到应用之间的空白。
二是大模型的应用会持续深入,但这主要会集中在数字化程度很高且业务体量非常大的领域。在C端,大模型也会进一步普及,不过对于中国公司来说,不能只依赖C端用户付费,C端应用场景中会加入其他变现模式,主要是广告。
三是国产算力会进一步得到重视,得到重视并不意味着短期内会有明显进步,这是一个漫长的过程。国产算力能力提升的同时,会有更多趁机炒作、造势、圈钱的现象。
风口会刺激产业迅速扩张,泡沫随之而生,机会越大,泡沫就越大。只有撇开泡沫,才能看清产业发展的新机会。

Translation:
Five real problems of China’s large model industry
Large-model startups are surging. Only by looking past the surface excitement can we see the new opportunities that large models bring.
The hottest topic in technology in 2023 was the AI large model. The boom was led by the US startup OpenAI. Within a few months of ChatGPT’s release, Chinese companies were launching large models of their own in rapid succession, and over the course of 2023 Chinese companies released more than 130 of them.
OpenAI achieved its technical breakthrough in much the same way as many other innovative technology companies: it had enough outstanding talent, massive financial backing, years of sustained investment, and an unwavering commitment to its goal. For a long time before ChatGPT’s release, most of the industry and the investment community were skeptical of OpenAI, but that did not shake the company’s direction. In 2023, almost everyone came to endorse the large-model direction. The prevailing view was that OpenAI had already shown the result, and that other companies simply needed to follow quickly, keep optimizing, and make sure they had a place in the future.
Some people blame the earlier lack of large-scale investment in large models on uncertainty about the outcome. Now that the outcome is certain, the argument goes, investment in computing power, data, and talent can all be stepped up, and because Chinese companies excel at engineering optimization, practical large-model products are just around the corner.
But is that really the case? For OpenAI, large models were always the settled direction. Most of OpenAI’s funding went into computing power, at a time when Nvidia’s A100 (a dedicated AI chip) cost far less than it does today. The third-party research firm SemiAnalysis estimates that OpenAI used about 3,617 HGX A100 servers containing nearly 30,000 Nvidia GPUs. GPUs alone were not enough: its investor Microsoft helped OpenAI build a computing cluster customized for large models, which further improved the efficiency of those GPUs. On the data side, OpenAI invested continuously in every step, from data collection, labeling, and cleaning to organization and optimization. Most members of the OpenAI team came from top research institutions or tech giants.
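As a quick sanity check on the SemiAnalysis estimate, the short Python sketch below multiplies the server count by the number of GPUs per server. The figure of eight GPUs per HGX A100 server is an assumption (the text gives only the two totals); it is the configuration consistent with “nearly 30,000.”

```python
SERVERS = 3617        # HGX A100 servers, per the SemiAnalysis estimate cited above
GPUS_PER_SERVER = 8   # assumption: 8-GPU configuration, not stated in the text

total_gpus = SERVERS * GPUS_PER_SERVER
print(total_gpus)  # 28936
```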
In other words, even with this level of capability and investment, OpenAI still needed more than eight years to build its breakthrough product, GPT-4, and the model still suffers from “hallucinations” (giving irrelevant answers, making things up, and so on).
How, then, can Chinese companies produce large models that claim to rival GPT-4 in just a few months? Whose hallucination is this?
In the second half of 2023, a number of large models were called out as “shells”: they were built directly on foreign open-source large models, yet ranked near the top of leaderboards that test model capability, with many metrics close to GPT-4. Several industry insiders told Caijing that the better a model’s leaderboard results, the larger the share of it that is borrowed, and that even slight modifications degrade its performance.
“Shelling” is only the tip of the iceberg in China’s large-model industry. Behind it lie five problems in the industry’s development, which are each other’s cause and effect, so none can be solved in isolation. By now public enthusiasm for large models has clearly cooled, and in 2024 the problems of China’s large-model industry will be further exposed. Yet beneath the hype and the problems, large models are already delivering value in industry.
1. Models: original, assembled, or shell?
In November 2023, Jia Yangqing, an AI scientist and former vice president of technology at Alibaba, wrote that a large model built by a major Chinese company was in fact Meta’s open-source model LLaMA with only a few variable names changed. Jia said the renaming created a lot of extra adaptation work.
Earlier, overseas developers had claimed that 01.AI (零一万物), the company founded by Kai-Fu Lee, used LLaMA and had merely renamed two tensors, which led the industry to question whether 01.AI’s model was a “shell.” Kai-Fu Lee and 01.AI both responded, saying that the training process had followed an open-source architecture in order to test the model thoroughly and run comparative experiments, which allowed a fast start, but that the released Yi-34B and Yi-6B models were trained from scratch and included a great deal of original optimization and breakthrough work.
In December 2023, media reported that a secret large-model project at ByteDance had called OpenAI’s API (application programming interface) and used data output by ChatGPT to train its model, which OpenAI’s terms of use explicitly prohibit.
OpenAI then suspended ByteDance’s account and said it would investigate further; if the reports proved true, it would require changes or terminate the account.
ByteDance responded that in early 2023, when its technical team was just beginning to explore large models, some engineers had used the GPT API service in an experimental research project on a smaller model. The model was for testing only, was never planned for launch, and was never used externally. The practice stopped after the company introduced compliance checks on GPT API calls in April 2023. ByteDance’s large-model team has also issued explicit internal rules that data generated by GPT models must not be added to the training datasets of ByteDance’s large models, and has trained its engineers to comply with the terms of service when using GPT.
Chinese large models currently fall into three categories: original large models; shells built on foreign open-source large models; and assembled large models, in which earlier small models are stitched together into a “large model” whose parameter count merely looks large.
Original large models are the fewest. Building one requires deep technical accumulation and sustained heavy investment, and the risk is high: if the model is not competitive enough, that investment is wasted. The value of a large model has to be proven through commercialization. Once sufficiently good foundation models exist in the market, other companies should look for new sources of value, such as applications of large models in different fields, or the middle layer, for example model training support, data processing, and computing services.
In reality, however, most players are locked in a race to build so-called “original large models” while worrying that the risk is too high, and the result is a flood of shell and assembled models. Using an open-source model directly, or assembling models, is perfectly fine as long as the relevant rules are followed. At the commercialization stage customers do not much care whether a model is original, as long as it works, and many customers even prefer non-original technology because it costs less.
The problem is that even companies that assemble or shell insist on claiming “originality.” To prove it they have to tweak and modify the model, which in turn hampers their ability to iterate and traps them in wasted effort.
2. Computing power: a chokehold, or an unwillingness to buy?
Massive, advanced computing power is one of the foundations of large models, which is why they are sometimes described as “brute-force aesthetics.” Nvidia’s A100 was previously considered the best chip for training large models. Nvidia recently released the more advanced H100, but it is not yet on sale in the Chinese market.
A long-time Nvidia partner told Caijing that the price of the A100 roughly doubled in 2023. As far as he knows, the Chinese companies that bought A100s heavily in 2023 were mainly big tech companies with their own business needs, including Alibaba, Tencent, ByteDance, and Baidu; very few were startups. Some well-known large-model startups have approached him to set up a “strategic partnership” so that they can show the outside world they are investing in computing power, “the kind where no money changes hands.”
The Chinese companies that bought A100s heavily in 2023 were mainly big tech companies with their own business needs; few were startups. Photo: IC
Despite the US government’s export control rules, it is not impossible for Chinese companies to obtain Nvidia computing power, and there are many ways to do so. Besides buying directly, they can buy through Nvidia’s partners in China. But the GPUs themselves are expensive, and deploying, operating, tuning, and using them all add cost. A saying once circulated in the industry that many Chinese research institutions could not even afford the electricity to run A100s.
A DGX server with eight A100s has a maximum power draw of 6.5 kW, so it consumes 6.5 kWh of electricity per hour of operation, and it needs cooling equipment that draws roughly the same amount again. At an average industrial electricity price of 0.63 yuan per kWh, running one server for a day (24 hours) costs about 200 yuan. For 1,000 servers, the electricity bill is about 200,000 yuan a day.
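The arithmetic behind these figures can be checked with a short Python sketch. It uses only the numbers given above (6.5 kW per server, cooling that draws about the same again, 0.63 yuan per unit of electricity). The function name and the `cooling_factor` parameter are illustrative, and real costs would depend on actual utilization and cooling efficiency.

```python
def daily_electricity_cost_yuan(
    servers: int = 1,
    server_power_kw: float = 6.5,      # maximum draw of one eight-A100 DGX server (from the text)
    cooling_factor: float = 2.0,       # assumption: cooling draws about as much as the server itself
    price_yuan_per_kwh: float = 0.63,  # average industrial electricity price (from the text)
    hours: float = 24.0,
) -> float:
    """Estimate the electricity cost of running `servers` at full load for `hours`."""
    energy_kwh = servers * server_power_kw * cooling_factor * hours
    return energy_kwh * price_yuan_per_kwh


if __name__ == "__main__":
    print(f"1 server:      {daily_electricity_cost_yuan(1):,.2f} yuan/day")
    print(f"1,000 servers: {daily_electricity_cost_yuan(1000):,.2f} yuan/day")
```

Under these assumptions one server comes to 196.56 yuan a day and 1,000 servers to 196,560 yuan, which matches the rounded figures of about 200 yuan and about 200,000 yuan.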
So apart from the big tech companies, startups find it hard to buy and deploy GPUs at scale.
GPUs can also be rented. A100 computing services are available for rent on cloud platforms such as Alibaba Cloud, Tencent Cloud, and Amazon AWS. Rental prices also rose considerably over the past year.
In practice, though, many large-model companies do not want to invest heavily in computing power. Several investors who follow AI told Caijing that once a startup begins deploying computing power, two “problems” arise. First, the spending has no ceiling and no end point; nobody knows how much will have to be burned. Even today OpenAI still suffers outages because its computing power cannot keep up. Second, the company becomes asset-heavy, which hurts its future valuation and directly affects investors’ returns.
In 2023, many Chinese investors told large-model founders outright: hire some people with elite university backgrounds, hold a launch event as soon as possible, release a large-model product, then raise the next round, and do not buy computing power.
Startups raise large sums while the hype lasts, hire at high salaries, launch products with fanfare, and push up their valuations. Once the hype passes, further fundraising or an IPO requires revenue. At that point they use the money raised earlier to bid for projects at low or even loss-making prices, or invest in other companies outright so as to consolidate those companies’ revenue into their own.
This can become a vicious circle: a company unwilling to bear the risk of heavy spending on computing power will struggle to make breakthroughs in large models, and will therefore struggle to compete with the giants that really are investing at scale.
3. Data: how to deal with low-quality data?
Data and computing power are both foundations of large models. On data, China’s large-model industry faces the same question as with computing power: is large-scale investment worth it?
In China the barrier to acquiring general data is low. In the past it was mostly collected with web crawlers; now open-source datasets can be used directly. Chinese large models rely mainly on Chinese-language data, and the industry generally regards the quality of Chinese internet data as low.
One AI company founder said that when he needs to look up specialized information online, he uses Google or goes to YouTube. Chinese websites and apps do not lack specialized information, but there is so much advertising that finding it takes longer.
The Chinese-language data OpenAI uses to train its large models also comes from Chinese internet platforms, but OpenAI has done a great deal of extra work to raise its quality. That is beyond ordinary data labeling; it requires specialized teams to clean and organize the data.
AI entrepreneurs have said before that it is hard to find reasonably standardized data service providers in China. Most offer customized services, and customized services are expensive.
The logic resembles the question of whether to invest heavily in computing power. For many companies, especially startups, the spending does not look worthwhile. If the resulting model falls short after a large investment, the money is likewise wasted, so it seems better to train on open-source data and go straight to a launch event.
In addition, the Chinese market lacks effective means of protecting data. The head of AI at a big tech company said: “In China, whatever data you can get, others can get too. If you spend a lot of money producing high-quality data, others can obtain it at very low cost, and vice versa.”
The intermediate steps of the large-model industry, including data processing, will be a relatively clear new direction in 2024. Whatever the model, deploying it in a specific application scenario requires tuning with domain-specific data, which places higher demands on data processing and also calls for model tuning, engineering optimization, and other steps.
But if those steps in turn become the “next big thing” in investors’ eyes, that will be another story.
4. Capital: is capital the only one that is short-sighted?
The three problems above all point in one direction: short-sighted capital.
Although OpenAI has blazed a clear trail, for the vast majority of companies the cost and time required to build a mature large model from scratch will not be much lower.
For most investors, the purpose of every investment is clear: exit and make money. OpenAI took off, its valuation has climbed steadily, and it is expected to keep growing. In April 2023 the company was valued at about $28 billion; by December 2023, according to US media reports, its latest round could value it at more than $100 billion. Investors read this as a very clear signal: invest in a Chinese large-model startup at the right price, and the valuation could likewise multiply in a very short time.
Chinese investors have only three to five years of patience, a limit set by how their capital operates. Investors raise money from LPs (limited partners) and must exit with a substantial return within a fixed number of years. Exit routes include an acquisition of the portfolio company, an IPO, or selling their shares to new investors in a later funding round.
Early funding rounds can ride on hype and storytelling, but the middle and late stages, let alone an IPO, require commercial revenue at some scale. Investors have found that the longer they wait, the harder it becomes for a company to go public or be acquired, because the main business model in AI is customized projects for business customers, a path on which startups can hardly achieve fast-growing revenue. So investors can only move while the hype lasts, pushing the company through several funding rounds quickly to raise its valuation; after that, even selling their shares at a discount is worthwhile.
This is why large-model launch events came one after another in 2023, and why so many large-model leaderboards sprang up, each with a different ranking: all of them are “stories” that help with fundraising. The AI industry went through a similar path a few years ago, when the representative companies were the “four AI dragons.” Large-model startups in 2023 simply compressed into one year the road that previously took three.
But short-sightedness is by no means a problem of investors alone. In today’s business environment most people pursue short-term, certain results, and the future ten or even five years out seems hard to grasp.
5. Commercialization: who is the right paying customer?
In 2023 China’s large-model industry moved quickly from competing on parameter counts to competing on commercialization. At CES (the Consumer Electronics Show) in January 2024, the prominent AI scientists Fei-Fei Li and Andrew Ng both said that AI commercialization would advance markedly and reach into more industries.
For now, large models have two main application directions. The first is to offer consumers new tools built on large-model technology, such as the paid version of GPT-4, Baidu Wenku rebuilt on Baidu’s Wenxin (ERNIE) large model, new AI video-editing tools, and text-to-image tools. But consumer payments are unlikely to grow substantially in the short term, because relatively few people have a pressing need for large-model tools.
The more promising direction is services for business customers. In China, selling enterprise software and services has always been a notoriously hard business. Several investors and industry insiders noted that the largest business customers in China are government bodies and state-owned enterprises. As an advanced productivity tool, a large model directly reduces the need for staff, and in government and state-owned enterprises, reducing staff often becomes a source of resistance rather than a selling point.
Settling for small and medium-sized business customers instead is also likely to be hard in 2024. One large-model entrepreneur said he recently sounded out many enterprise customers, and the response he got was: “What can a large model do? Can it help me lay people off, or can it help me make money?”
To this day even the most advanced large models still hallucinate. That is tolerable in consumer applications, but in some professional business scenarios hallucination makes real deployment difficult. With earlier matching-based AI such as face recognition, a wrong result was cheap to catch and correct by hand, but large models are good at “talking nonsense with a straight face,” which makes their errors misleading.
Even so, large models are already in real use. Several industry insiders said that large models have opened up new ways to solve many problems that could not be solved before, with clear gains in efficiency. Take the assembled large models mentioned above: few people tried this approach in the past, but many AI companies now stitch together several small models built for different scenarios, so that most problems of a similar kind no longer need a separately trained model and an existing one can be called directly.
Large models have also been deployed inside companies with very large businesses. In the previous wave, AI vision technology drove the development of AI algorithms, which quickly proved valuable in content recommendation, e-commerce, ride-hailing, food delivery, and other fields. In the same way, Tencent’s games business, Alibaba’s e-commerce business, and ByteDance’s content business are now all using large models.
Several trends look relatively certain for AI large models in 2024. First, funding will cool: cases like those in 2023, where a single company closed several rounds worth hundreds of millions of dollars, will become much rarer, and large-model startups will need to find a new way forward. For now, the big tech companies are better placed to build large-model infrastructure, and startups could consider changing direction to fill the gap between foundation models and applications.
Second, applications of large models will keep deepening, but mainly in fields that are highly digitized and have very large business volumes. Large models will also spread further among consumers, but Chinese companies cannot rely on user payments alone, so consumer applications will add other ways to make money, chiefly advertising.
Third, domestic computing power will receive more attention. More attention does not mean clear progress in the short term; this will be a long process. And as domestic computing capability improves, there will also be more opportunistic hype, publicity stunts, and cash grabs.
A boom drives an industry to expand rapidly, and bubbles follow; the bigger the opportunity, the bigger the bubble. Only by setting the bubbles aside can we see the new opportunities in the industry’s development.
This article was reposted by 数字化转型网 (www.szhzxw.cn); source: 财经十一人. Editing and translation: 宁檬树, 数字化转型网.

