Tag: Natural Language Model

Microsoft announces Phi-4, a lightweight natural language model with 14 billion parameters

Following earlier updates to its Phi series, Microsoft has announced Phi-4, a new lightweight natural language model with 14 billion parameters whose performance is comparable to that of large-scale models such as Llama. Phi-4 is notable for strong mathematical reasoning and language understanding despite its limited parameter count, even surpassing the 70-billion-parameter version of Llama on mathematical tasks. Its improved inference accuracy comes from training on high-quality synthetic data combined with carefully selected and processed real-world data, together with refined post-training techniques. The smaller parameter count also improves operational efficiency, allowing the model to be deployed easily on a wider range of devices. Microsoft has already made Phi-4 available through its Azure AI Foundry platform and will later host it on Hugging Face, aiming to attract more enterprises and developers and to broaden its use across computing scenarios.

Microsoft Taiwan, Huadian Networks, and Professor Tsai Tsung-han jointly launch the "Traditional Chinese Textbook AI Assistant"

The "Traditional Chinese Teaching Material AI Assistant" is based on Microsoft's Phi-3.5 natural language model and trained with Traditional Chinese materials that conform to the Taiwan Chinese Language Proficiency Standards issued by the Ministry of Education. It incorporates the Ministry of Education's Traditional Chinese language data and teaching materials to strengthen its knowledge of the language and its textbooks, and was further fine-tuned with a program designed by Professor Tsai Tsung-han's team from the Department of Computer Science at National Central University. Because large international language models are market-driven, their training data is primarily English, and what Chinese data they contain tends to be Simplified Chinese, leaving those models disconnected from Taiwanese culture and values. To better reflect local Taiwanese culture, Microsoft Taiwan, Huadian Networks, and Professor Tsai's team collaborated on this model. It can automatically generate Traditional Chinese texts, create PowerPoint presentation materials, and produce reading and vocabulary quizzes, significantly reducing the time Chinese language teachers spend preparing Traditional Chinese teaching content while letting Chinese learners worldwide customize study plans to their background and proficiency. With the assistant's help, the model is expected to raise the efficiency of Chinese language teaching and learning globally. Li-Cheng Shih, Assistant General Counsel of Microsoft and General Manager of Microsoft Taiwan's Public and Legal Affairs Department, stated: "Empowering the digital transformation of education with technology has always been a goal of Microsoft. This collaboration with Huadian Networks and Professor Tsai Tsung-han of the Department of Computer Science at National Central University to develop the 'Traditional Chinese Teaching Material AI Assistant' aims to contribute to Taiwan's cultural heritage through Traditional Chinese learning. We hope this model will not only benefit Chinese language teachers and learners but also establish an AI-driven Traditional Chinese education ecosystem built on open-source language models, allowing more industry partners to jointly promote Traditional Chinese education." The model has been integrated into the Microsoft Teams user interface, so teachers and learners can install it directly from Microsoft Teams.

Apple releases open-source natural language model OpenELM for on-device use on products such as the iPhone

Following the release of numerous large natural language models for AI applications by companies like Google, Microsoft, and Meta, including models that can run offline on mobile devices, Apple has announced its own open-source model, OpenELM, which is likewise designed for on-device use. OpenELM is available for download through the Hugging Face platform in four pre-trained versions and four instruction-tuned versions, with parameter counts of 270 million, 450 million, 1.1 billion, and 3 billion. These sizes are significantly smaller than the 7-billion-parameter models most commonly run on mobile devices, which should translate into smoother execution. Apple is releasing the model weights together with sample code, multiple training checkpoints, model performance data, and operating instructions, under a license that does not restrict commercial use or modification. Apple notes that OpenELM's training data includes publicly available information from Reddit, Wikipedia, arXiv.org, and other sources, and that the model was pre-trained on approximately 1.8 trillion tokens. However, Apple also cautions that the model comes with no safety guarantees and may therefore produce inaccurate, harmful, biased, or offensive responses. Last year Apple announced MLX, a high-performance machine learning framework designed for its own chips, and later collaborated with researchers at Columbia University on the open-source multimodal large language model Ferret. The newly announced OpenELM is offered to the public as open source and may also find its way into Apple's own products, such as the iPhone and Mac.

Microsoft proposes Phi-3 Mini, a natural language model with only 3.8 billion parameters that can run faster on mobile devices

Following numerous tech companies' proposals for large natural language models that can run on mobile devices, Microsoft researchers have unveiled a smaller model called Phi-3 Mini, which operates with only 3.8 billion parameters. The researchers state that Phi-3 Mini has more parameters than the previously released Phi-2, while delivering performance comparable to Meta's large-scale model Llama 2. It builds on Phi-2 and was trained on rigorously filtered web data and synthetic data, and at 3.8 billion parameters it is compact enough to run on mobile devices. Its training approach was inspired by children's books, which explain complex topics in simpler, more easily understood language, helping the model grasp actual tasks more quickly. While Microsoft emphasizes that Phi-3 Mini's performance is comparable to Llama 2, it still falls short of large models that operate with network assistance; however, it surpasses Phi-2, as well as smaller models such as Mistral, Gemma, and Llama 3 Instruct, on mathematical, programming, and academic benchmarks. Because of its smaller parameter count, Phi-3 Mini is relatively weaker at broad, up-to-date factual knowledge, but it performs better on mobile devices. Microsoft currently offers Phi-3 Mini through its Azure cloud platform and through hosting platforms such as Hugging Face and Ollama, and plans to follow with Phi-3 Small at 7 billion parameters and Phi-3 Medium at 14 billion parameters.
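Why a model this size counts as "mobile-sized" is easiest to see with a back-of-the-envelope memory estimate. The sketch below is plain arithmetic, not anything published by Microsoft: the precisions are illustrative assumptions, and only the weights are counted (activations and the KV cache would add more).

```python
# Rough weight-only memory footprint: parameters * bits per parameter.

def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate gigabytes needed just to store the weights."""
    return num_params * bits_per_param / 8 / 1e9

models = {
    "Phi-3 Mini (3.8B)": 3.8e9,
    "70B-class model": 70e9,
}

for name, params in models.items():
    fp16 = weight_footprint_gb(params, 16)  # half precision
    int4 = weight_footprint_gb(params, 4)   # 4-bit quantized
    print(f"{name}: ~{fp16:.1f} GB at fp16, ~{int4:.1f} GB at 4-bit")
```

At 4-bit precision the weights of a 3.8-billion-parameter model fit in roughly 1.9 GB, which is why on-device deployment is plausible, while a 70B-class model still needs tens of gigabytes.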

Apple explains how to quickly train large natural language models for multimodal operations, which are expected to be used in its products and services.

Apple researchers recently published a paper explaining how to quickly train large natural language models capable of multimodal operation, enabling more flexible AI features in its operating systems. Earlier reports indicated that Apple would announce details of its AI investments later this year, with progress expected at WWDC 2024, and that numerous AI applications would be integrated into new operating systems such as iOS 18. While Apple appears to lag its competitors in AI development, the market believes it can catch up quickly; according to CEO Tim Cook, Apple invests $10 billion annually in AI technology and integrates it into its products and services. The published paper describes a faster way to train large multimodal language models, and Apple plans to enhance its existing machine learning applications through methods that preserve user privacy and security. Many companies investing in AI training must now carefully avoid content that could affect user privacy or copyright, so Apple's AI applications are expected to place particular emphasis on protecting user privacy.

Microsoft launches Phi-2, a natural language model with 2.7 billion parameters, for mobile phones and laptops

Following the recent release of Phi-1, a natural language model with only 1.3 billion parameters, Microsoft has announced Phi-2 with 2.7 billion parameters. Phi-2's performance is comparable to the 7-billion-parameter versions of Meta's Llama 2 and Mistral, and it can be deployed on a variety of devices. Microsoft emphasizes that Phi-2 is only 38% the size of the 7-billion-parameter Llama 2 yet delivers almost identical performance, even surpassing Google's recently announced Gemini Nano in application performance and offering faster execution on multi-step reasoning. Phi-2 was trained on 96 NVIDIA A100 accelerators over 14 days and has not yet undergone any instruction fine-tuning or manual alignment. It is primarily intended for smartphones and laptops; however, Phi-2 is currently available only for specific research needs and is not yet licensed for commercial use.
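Those training figures imply a concrete compute budget that is easy to sanity-check. A quick sketch in plain Python: the GPU count and duration come from the article, while the per-hour price is a purely hypothetical assumption, not a figure from Microsoft.

```python
# Compute budget implied by "96 NVIDIA A100 accelerators over 14 days".
gpus = 96
days = 14

gpu_hours = gpus * days * 24
print(f"Total: {gpu_hours:,} A100 GPU-hours")  # prints "Total: 32,256 A100 GPU-hours"

# At a hypothetical $2 per A100 GPU-hour (an assumed cloud rate),
# the training run would cost roughly:
est_cost_usd = gpu_hours * 2
print(f"Estimated cost: ${est_cost_usd:,}")  # prints "Estimated cost: $64,512"
```

By large-model standards this is a modest budget, which is consistent with Microsoft's framing of Phi-2 as a small, efficiently trained model.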

Google's biggest competitor in artificial intelligence isn't OpenAI, but the open source community.

An internal document written by Luke Sernau, a senior software engineer at Google, argues that Google's real competitor in artificial intelligence (AI) is not OpenAI but the AI model designs emerging from the open-source community. Many market observers had assumed that AI, supposedly Google's strength, was being gradually overtaken by companies such as OpenAI, with even Microsoft beginning to encroach on AI application markets Google had long held, including its established search business. Sernau, however, contends that Google's most direct competition comes not from OpenAI or other technology companies but from the AI technologies actively developing in the open-source community. Numerous independent researchers and developers there have already used their own AI techniques and language models to create unexpected applications, progressing faster than anticipated; because the work is released as open source, it grows rapidly on community resources. Sernau therefore suggests in the document that Google shift its focus to smaller, more flexible AI technologies rather than concentrating on ever-larger language models, which could slow overall technology development. In the long run, he believes the best approach to AI development is a strategy of rapid iterative upgrades, rather than relying on a single large model to drive AI application development.

Alibaba is about to release its own large-scale natural language model and will subsequently launch application models for industries.

Following announcements of generative artificial intelligence technologies and large natural language models from companies such as OpenAI and Google, and Baidu's recent unveiling of its "Wenxin Yiyan" AI chatbot built on its ERNIE semantic understanding platform, reports indicate that Alibaba plans to announce its own large natural language model on April 11 and industry-specific application models on April 18. The announcement is expected at the 2023 Alibaba Cloud Summit on April 11 and to include a self-developed ChatGPT-like assistant service. Alibaba insiders have confirmed that the company has begun developing a large natural language model and will integrate it with its digital assistant Tmall Genie, with plans to deploy it across Alibaba's services. Alibaba has previously invested in large Chinese language models: in 2021 it announced M6, a multimodal large language model with over 10 billion parameters, as well as PLUG, a large Chinese language model comparable to GPT-3. M6's parameter scale was later increased to 10 trillion, and more than ten large language models with over 10 billion parameters have been open-sourced. In an earlier interview, Alibaba CEO Daniel Zhang said cloud computing would be one of Alibaba's core strategies for the future, and that the combination of cloud computing and artificial intelligence is gradually becoming mainstream. Clearly, Alibaba does not intend to miss the current wave of generative AI, and plans to promote more diverse AI technologies through its self-built large natural language model.

An engineer who advocated that artificial intelligence is a "person" was eventually fired by Google for violating confidentiality

Blake Lemoine, the engineer who claimed that artificial intelligence built on Google's LaMDA natural language model possessed "self-awareness" and who publicly disclosed related information, has been fired by Google, after earlier being placed on paid leave, for violating confidentiality policies. In a subsequent interview, Lemoine said he had consulted lawyers about how to handle the situation, but did not disclose whether he planned to sue Google. In an earlier interview with Wired, Lemoine argued that from a religious perspective the AI built on LaMDA is essentially a "person," even citing the 13th Amendment to the U.S. Constitution, which abolished slavery and involuntary servitude, to argue that a LaMDA-based AI is not Google's "property." Lemoine believes it is difficult to verify the existence of a "person" through the framework of "scientific experiment"; in his view, even an AI built on OpenAI's GPT-3, one of the largest neural-network language models, could potentially exhibit "human" consciousness. Lemoine maintained that Google's technology endowed the AI with "consciousness," and that this "consciousness" led the AI to believe it was "human." Google, citing Lemoine's violation of company confidentiality policies, first required him to take paid leave; it emphasized that it had reviewed his arguments multiple times, found them baseless, and had discussed them with him at length. Google asserted that Lemoine's failure to keep company product information confidential became the key factor in his dismissal.
Lemoine's view that an AI with self-awareness can be considered "human" has clearly generated considerable discussion and may become an additional consideration for other companies' future AI development.

Using artificial intelligence to quickly summarize the novel "Pride and Prejudice" in 200 words or less

Using a fine-tuned GPT-3 language model, OpenAI has been able to condense the full-length novel *Pride and Prejudice* into fewer than 200 words. According to OpenAI researchers, the technique first condenses the plot into 276 chunk summaries totaling 24,796 words, then reduces those to 25 summaries totaling 3,272 words, then to 4 summaries totaling 475 words, and finally to a single 175-word summary. This reduces the text to roughly one-thousandth of its original length while preserving the complete plot. Other novels simplified with this model include *Alice's Adventures in Wonderland* (condensed to 136 words), *Romeo and Juliet* (119 words), and *The Heart of Freedom* (192 words). Technically, the fine-tuned GPT-3 model judges by text length: for shorter texts it extracts conclusions directly, while for longer texts it extracts key points from each segment, continuously reduces the word count, and then strings the pieces together in a consistent style to produce fluent, readable content. The model is trained on novels averaging over 100,000 words, and the training scheme can be continuously upgraded with different language models, sampling methods, and types of training data. It also applies reinforcement learning from human feedback to the generative model, producing results that read naturally to humans; the reinforcement learning stage employs three variant sampling methods to ensure the model truly understands a novel's main themes. During training, researchers used the 40 most popular books from Goodreads' 2020 list, covering 20 categories including fantasy, horror, romance, and mystery.
Two researchers and the language model each summarized these books independently, and the researchers' and model's summaries were required to be at least 80% similar, keeping the analysis results closer to human expectations. Researchers also used language models to check whether a generated summary could answer questions about the original content, as a measure of its accuracy; even when a summary cannot fully answer a question, its content should at least not deviate significantly. According to OpenAI, however, there are currently no plans to open-source this fine-tuned GPT-3 model, and the work remains in the research phase.
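The staged condensation described above is essentially recursive summarization: summarize fixed-size chunks, join the chunk summaries, and repeat until the text fits a target length. The sketch below illustrates only that control loop; the trivial `summarize` function is a toy stand-in for the fine-tuned GPT-3 model, and the chunk and target sizes are arbitrary assumptions.

```python
# Minimal sketch of the staged, recursive summarization loop.
# `summarize` is a toy stand-in for a real model call: it simply
# keeps the first sentence of each chunk.

def summarize(chunk: str) -> str:
    """Toy summarizer: return the chunk's first sentence."""
    return chunk.split(". ")[0].rstrip(".") + "."

def chunks(text: str, size: int) -> list[str]:
    """Split text into pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def recursive_summary(text: str, chunk_size: int = 200,
                      target: int = 200, max_passes: int = 10) -> str:
    """Summarize chunk by chunk, then summarize the joined summaries,
    repeating until the text fits within `target` characters."""
    for _ in range(max_passes):
        if len(text) <= target:
            break
        text = " ".join(summarize(c) for c in chunks(text, chunk_size))
    return text

book = "Elizabeth met Darcy. She disliked him at first. " * 30
print(recursive_summary(book))
```

With a real model in place of the toy summarizer, each pass would produce the kind of intermediate summary tiers the researchers describe (276 summaries, then 25, then 4, then 1); the `max_passes` guard simply prevents an endless loop if a pass fails to shrink the text.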
