Tag: Cloud collaborative computing

Google Cloud launches Cloud Run GPU service, allowing NVIDIA L4 GPUs to be flexibly configured in the cloud based on computing needs.

Google Cloud recently announced its Cloud Run GPU service, which lets users tap NVIDIA L4 GPUs in the cloud with automatic scaling and flexible deployment. The service is primarily designed for workloads such as artificial intelligence computing, inference, and training. Because there is no need to request or provision GPU configurations in advance, GPUs are allocated automatically according to computing demand, preventing idle GPU resources and unnecessary cost, while automated deployment increases flexibility and simplifies management.

The service is billed per second and automatically scales to zero when not in use. From a cold start, the GPU and its drivers are ready in roughly 5 seconds; for example, running inference with the 4-billion-parameter Gemma 3 model takes only about 19 seconds from cold start to generating the first token, meaning services can spin up very quickly.

GPU support can be added to applications directly or enabled via the service console. Because the service is offered in a flexible configuration, Google Cloud emphasizes its reliability, stating that users and enterprises can deploy it across multiple regions according to their operational needs, and can also turn off zonal redundancy to adjust the overall available computing resources. Currently, the Cloud Run GPU service is available in multiple Google Cloud regions across the United States, Europe, and Asia.
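As a rough illustration of the deployment model described above, the sketch below shows a minimal Python inference entry point of the kind a Cloud Run GPU container might host, assuming an image with Flask and PyTorch installed. The route names and the placeholder "model" are hypothetical; only the PORT environment variable contract comes from Cloud Run itself.

```python
import os

import torch
from flask import Flask, jsonify, request

app = Flask(__name__)

# On an instance with an attached L4 GPU the CUDA device is visible to the
# container; on CPU-only revisions this falls back cleanly to the CPU.
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"


@app.get("/healthz")
def healthz():
    # Report which device this revision is running on (hypothetical route).
    return jsonify(device=DEVICE, gpu=torch.cuda.is_available())


@app.post("/infer")
def infer():
    # Placeholder "model": a single matrix multiply on the active device,
    # standing in for a real model forward pass.
    data = request.get_json(force=True)
    x = torch.tensor(data["inputs"], dtype=torch.float32, device=DEVICE)
    y = (x @ x.T).sum().item()
    return jsonify(result=y, device=DEVICE)


if __name__ == "__main__":
    # Cloud Run injects the port to listen on via the PORT variable.
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```

A container built around an entry point like this can then be deployed as an ordinary Cloud Run service with a GPU requested at deploy time; because billing is per second and instances scale to zero, the service only incurs GPU cost while requests are actually being served.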

Apple's AI cloud collaborative computing architecture uses the same network IP masking technology as iCloud Private Relay.

During the WWDC 2024 keynote introducing "Apple Intelligence," Apple noted that when AI-related features require cloud-based collaborative computing, data is still sent only to Apple's own independent servers for further processing. In a follow-up interview, Craig Federighi, Apple's senior vice president of software engineering, revealed that Apple designed this computing architecture with considerable caution, and that its underlying operating principle uses the same network IP masking technology as iCloud Private Relay.

▲ Apple uses the "Private Cloud Compute" cloud-based collaborative computing architecture to compensate for insufficient on-device computing power.

Craig Federighi explained that in the "Private Cloud Compute" process, which performs AI collaborative computing on Apple's own independent servers, users upload only the specific content a request needs, rather than all of their information. Network IP masking additionally hides the user's connection details, allowing cloud-based collaborative computing to be completed in a de-identified manner.

Moreover, during this computation no user data is separately backed up or stored on Apple's servers. Data uploaded to the cloud is used for a single computation only: it can serve solely the collaborative computation the user requested and cannot be reused for additional purposes such as data analysis or training artificial intelligence models. Every completed computation and returned result must also be verified against the user's device to ensure data security.

▲ Apple emphasizes that no data is stored on its own independent servers, that all computations apply only to the scope of the user's request, and that strict verification confirms the user's identity.

Apple also revealed that "Private Cloud Compute" is built on Apple Silicon processors, meaning this computing architecture applies only to devices using Apple's own chips, just as the current "Apple Intelligence" features require devices with an A17 Pro processor or an M1 or later.
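The request flow described above, in which the device checks the server before handing over data for a single-use computation, can be sketched conceptually. The Python example below is emphatically not Apple's implementation; it is a hypothetical illustration of the general pattern of verifying a signed server attestation before uploading anything, using Ed25519 signatures from the `cryptography` package.

```python
# Conceptual sketch only -- NOT Apple's Private Cloud Compute code.
# Illustrates the general pattern: the client verifies a signed server
# attestation before uploading data, and the server handles the request
# statelessly, keeping nothing after the single computation.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def make_server():
    """Hypothetical server: one key pair standing in for attested hardware."""
    key = Ed25519PrivateKey.generate()

    def handle(payload: bytes) -> bytes:
        # Process entirely in memory; nothing is logged or persisted,
        # mirroring the "single computation only" property described above.
        return payload.upper()

    # The "attestation" here is simply a signature over a fixed statement.
    attestation = key.sign(b"genuine-server")
    return key.public_key(), attestation, handle


def client_request(public_key, attestation, handle, data: bytes) -> bytes:
    # The device refuses to upload anything unless the attestation verifies.
    try:
        public_key.verify(attestation, b"genuine-server")
    except InvalidSignature:
        raise RuntimeError("server failed attestation; not uploading data")
    return handle(data)


public_key, attestation, handle = make_server()
print(client_request(public_key, attestation, handle, b"only what the request needs"))
```

The real architecture involves hardware-rooted attestation and far stronger guarantees; the sketch only captures the ordering constraint the article describes, namely that verification gates the upload rather than following it.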

Google strengthens its cloud collaborative computing and artificial intelligence infrastructure, planning to introduce NVIDIA Blackwell accelerators

To strengthen cloud-based collaborative computing and its artificial intelligence infrastructure, Google announced that TPU v5p, the next-generation AI computing accelerator unveiled at the end of last year, is now available globally. Starting in May, it will also incorporate NVIDIA's H100 accelerator, codenamed "Hopper," into a computing instance called A3 Mega. Additionally, it plans to adopt NVIDIA's recently unveiled next-generation accelerator, codenamed "Blackwell," with the GB200 NVL72 computing system expected to be deployed in early 2025.

▲ Google will begin incorporating NVIDIA's H100 "Hopper" accelerator in May to create a computing instance called A3 Mega.

When it introduced TPU v5p, Google touted the chip's scalability and flexible deployment, calling it its most powerful tensor accelerator to date: per chip it delivers twice the computing power and more than three times the memory bandwidth of its predecessor, with near-linear scaling in data throughput. It can also support next-generation AI models four times larger while training existing models 2.8 times faster. A single TPU v5p pod consists of 8,960 chips, more than double the chip count of a TPU v4 pod, to meet the demands of large-scale artificial intelligence computing.

▲ TPU v5p, Google's most powerful tensor accelerator to date.

Beyond driving artificial intelligence computing, Google emphasizes that the new generation of TPUs will accelerate more cloud workloads and further improve the performance of services such as Search, YouTube, Gmail, Google Maps, and the Google Play Store. It will also speed up the many Android experiences that combine cloud-based collaborative computing with on-device computing, providing a more convenient user experience. Besides its own TPUs and NVIDIA's accelerated computing components, Google will continue to collaborate with AMD to integrate AMD's accelerator products, offering more diverse options for artificial intelligence acceleration.

▲ The GB200 NVL72 computing system is expected to be introduced in early 2025.

In addition to accelerating AI computing with its own TPUs, Google also plans to boost overall computing efficiency with "Axion," a custom processor based on the Arm Neoverse V2 core architecture. Compared with traditional x86 processors, it delivers 50% higher performance and 60% better energy efficiency, and Google states it performs 30% better than current Arm-based processors used for cloud collaborative computing, which also helps significantly reduce carbon emissions.

▲ The custom Arm Neoverse V2-based processor "Axion" improves overall computing efficiency, with 50% higher performance and 60% better energy efficiency than traditional x86 processors.

On the other hand, Google continues its collaboration with Intel, adopting fifth-generation Xeon Scalable server processors to power the C4 virtual machine now in preview and the N4 virtual machine already available globally. C3 machines for bare-metal use will also be launched. ...
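To put the pod-scale numbers in perspective, the short Python sketch below multiplies out aggregate throughput from per-chip figures. The per-chip bf16 throughput values (roughly 459 TFLOPS for TPU v5p and 275 TFLOPS for TPU v4) and the TPU v4 pod size are publicly reported figures quoted here as assumptions, not numbers taken from this article.

```python
# Back-of-the-envelope pod throughput, using publicly reported per-chip
# bf16 figures as assumptions (not taken from the article above).
V5P_TFLOPS_PER_CHIP = 459   # TPU v5p, bf16 per chip (assumed)
V4_TFLOPS_PER_CHIP = 275    # TPU v4, bf16 per chip (assumed)
V5P_CHIPS_PER_POD = 8960    # stated in the article
V4_CHIPS_PER_POD = 4096     # TPU v4 pod size (assumed)

v5p_pod_exaflops = V5P_TFLOPS_PER_CHIP * V5P_CHIPS_PER_POD / 1e6
v4_pod_exaflops = V4_TFLOPS_PER_CHIP * V4_CHIPS_PER_POD / 1e6

print(f"TPU v5p pod: ~{v5p_pod_exaflops:.2f} EFLOPS bf16")
print(f"TPU v4 pod:  ~{v4_pod_exaflops:.2f} EFLOPS bf16")
print(f"pod-level ratio: ~{v5p_pod_exaflops / v4_pod_exaflops:.1f}x")
```

Under those assumed figures, the pod-level gain works out to roughly 3.7x, consistent with the article's per-chip doubling combined with a more-than-doubled chip count per pod.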
