APMIC (Accelerate Private Machine Intelligence Company), an enterprise AI solutions provider founded in Taiwan in 2017 and an NVIDIA partner for model fine-tuning, has announced a collaboration with Twinkle AI, a Traditional Chinese language model research community, to launch Formosa-1, Taiwan's first Traditional Chinese reasoning model, which at 1 billion parameters is compact enough to run on mobile devices.
In addition, APMIC and Twinkle AI have jointly open-sourced "Twinkle Eval", an efficient framework for evaluating the effectiveness of AI models, to advance Taiwan's AI technology and its local applications.
Formosa-1 was developed by APMIC and the Twinkle AI community, with technical support and experience sharing from the R&D team at the National Center for High-performance Computing (NCHC). It is Taiwan's first large language model of its kind that, at 1 billion parameters, can run on mobile phones.
The model was trained with APMIC's PrivAI product, built on the NVIDIA NeMo end-to-end platform, using knowledge distillation. Its weights are fully open-sourced under the MIT license, promoting the development of open-source Traditional Chinese AI applications in Taiwan.
To strengthen its reasoning capabilities, Formosa-1 was trained on data tailored to the Taiwan Chain of Thought (TCoT) and curated with NVIDIA NeMo Curator to accelerate data management, yielding better performance in legal reasoning, logical reasoning, and mathematical deduction.
For language data, Formosa-1's training corpus covers 1,000 billion high-quality Traditional Chinese tokens drawn from diverse sources such as news, legal documents, essays, and social media discussions, ensuring the model's accurate understanding and use of Traditional Chinese contexts.
Twinkle Eval, an open-source evaluation framework designed for large reasoning models, integrates deeply with NVIDIA NeMo Evaluator to support large-scale parallel testing and to verify model stability and accuracy across multiple domains.
Twinkle Eval ensures test fairness by randomizing the order of answer options, preventing the model from exploiting a fixed ordering. It also incorporates a repeated-testing mechanism that verifies a model's stability across multiple independent inference runs. The tool ships with the TMMLU+ benchmark (a Traditional Chinese question bank spanning general and professional subjects), the Taiwan legal test set tw-legal-benchmark-v1, and the MMLU benchmark, ensuring both breadth and accuracy of testing.
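The two fairness mechanisms described above can be sketched in a few lines of Python. This is an illustrative outline only, assuming a generic `ask_model` callable that returns a chosen option index; the function names are hypothetical and do not reflect Twinkle Eval's actual API.

```python
import random
from collections import Counter

def shuffle_options(options, answer_index, rng):
    """Randomize option order so the model cannot rely on a fixed position."""
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    new_answer = order.index(answer_index)  # track where the gold answer moved
    return shuffled, new_answer

def repeated_eval(ask_model, question, options, answer_index, runs=5, seed=0):
    """Run several independent inferences with reshuffled options each time,
    returning the fraction of correct answers as a stability signal."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        opts, gold = shuffle_options(options, answer_index, rng)
        pred = ask_model(question, opts)  # model returns an option index
        results.append(pred == gold)
    return Counter(results)[True] / runs

# Toy "model" that always picks the longest option, for demonstration only.
demo = lambda q, opts: max(range(len(opts)), key=lambda i: len(opts[i]))
acc = repeated_eval(demo, "Which is the longest?", ["a", "bbb", "cc"], 1)
print(acc)
```

Because the gold index is re-mapped after every shuffle, a model that merely memorizes "the answer is usually B" scores at chance, while a model that genuinely selects the right content scores consistently across runs.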
In addition, through strict answer-format control and error-correction mechanisms, Twinkle Eval keeps answer formats consistent and reduces test error rates.
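Format control of this kind typically means parsing the model's reply against a required pattern and re-asking when it does not conform. The sketch below assumes the harness instructs the model to reply as "Answer: X"; the regex and retry logic are illustrative assumptions, not Twinkle Eval's actual implementation.

```python
import re

# Accept half- or full-width colons, since Traditional Chinese replies may use "：".
ANSWER_RE = re.compile(r"Answer\s*[:：]\s*([A-D])", re.IGNORECASE)

def extract_answer(reply):
    """Parse a single option letter; return None when the format is invalid."""
    match = ANSWER_RE.search(reply)
    return match.group(1).upper() if match else None

def score(reply, gold, reask=None, max_retries=1):
    """Grade one reply; optionally re-ask when the format cannot be parsed."""
    for _ in range(max_retries + 1):
        answer = extract_answer(reply)
        if answer is not None:
            return answer == gold
        if reask is None:
            break
        reply = reask()  # error-correction step: request a well-formed answer
    return False  # unparseable replies count as wrong, not as crashes

print(extract_answer("The reasoning is as follows... Answer: C"))  # -> C
```

Treating malformed output as a recoverable (or at worst, incorrect) case rather than a failure is what keeps large parallel test runs from aborting on a single badly formatted reply.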








