The fascination with artificial intelligence (AI) is at an all-time high. The advent of GPT-4 has intensified the interest in AI's advantages, leading businesses to allocate substantial resources to explore how automation and AI can enhance their operations. Among the benefits that captivate executives the most are streamlining business processes (74%) and facilitating the development of new products and services (55%).
Machine learning (ML) is one of the most difficult aspects of AI to deploy. According to Gartner research, 85% of ML projects fail to deliver, and only 53% progress from prototype to production. So how can engineers and developers tap into ML's potential?
Challenges
The time required to gain insights from ML projects can be a significant barrier to success for organizations that lack the time or resources to build the supporting infrastructure. According to Anaconda's State of Data Science 2022 report, 65% of organizations have not invested in the tools required for high-quality ML production.
Indeed, Google research discovered that under current practices, eight hours of ML engineering necessitates a substantial amount of preparatory work: 24 hours for feature engineering, 32 hours for data engineering and 96 hours for infrastructure engineering. This suggests that only 5% of the total 160 hours of work is dedicated to ML engineering.
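The arithmetic behind that 5% figure is worth making explicit. A quick sketch (the hour figures are those cited above; the breakdown itself is from the Google research):

```python
# Hours of work behind 8 hours of ML engineering, per the cited breakdown.
hours = {
    "ML engineering": 8,
    "feature engineering": 24,
    "data engineering": 32,
    "infrastructure engineering": 96,
}

total = sum(hours.values())                    # 160 hours in total
ml_share = hours["ML engineering"] / total     # 0.05

print(f"ML engineering is {ml_share:.0%} of {total} hours")
```

In other words, for every hour spent on the ML model itself, roughly 19 hours go into the surrounding data and infrastructure work.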
Developers find themselves investing enormous amounts of time managing and reconfiguring intricate components throughout their infrastructure. Consequently, ML may be inaccessible to smaller companies. ML projects can be compared to mining for diamonds: vast amounts of time and labor go into extracting a seemingly small but hugely valuable reward. As a result, only bigger organizations are able to capitalize on the opportunities.
The significance of open source
However, there is another way. The time and resource demands of ML, such as spending two months learning platforms like AWS SageMaker before accessing insights, are prompting a shift toward open source. For smaller businesses, open source methods help reduce complexity and lower entry barriers, as well as being a cost-effective, resource-efficient way to execute ML algorithms.
Open source tools are not only more sought after but can also be of higher quality than their proprietary counterparts. Devoid of proprietary constraints, open source tools can be easily modified for specific use cases, thereby simplifying the process of deriving insights from ML that are tailored to specific organizational demands.
Importantly, open source provides businesses with a means to access top-notch ML expertise that they may not possess internally. Kubeflow, for example, receives contributions from a wide range of industry experts, improving how it deploys and runs ML workflows on cloud-native Kubernetes infrastructure. The latest release, Kubeflow 1.7, garnered code contributions from over 250 individuals from the tech world.
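To make "running ML workflows on Kubernetes" concrete, here is a minimal sketch of a distributed training job using Kubeflow's Training Operator. The job name and container image are hypothetical placeholders; the structure (a `PyTorchJob` with master and worker replicas) follows the operator's custom resource format:

```yaml
# Minimal sketch of a Kubeflow Training Operator job.
# "demo-train" and the image URL are illustrative placeholders.
apiVersion: kubeflow.org/v1
kind: PyTorchJob
metadata:
  name: demo-train
spec:
  pytorchReplicaSpecs:
    Master:
      replicas: 1
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: pytorch   # the operator expects this container name
              image: registry.example.com/train:latest
    Worker:
      replicas: 2
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: pytorch
              image: registry.example.com/train:latest
```

Applied with `kubectl apply -f`, the operator schedules the master and worker pods and wires up the distributed training environment, which is exactly the kind of infrastructure work that would otherwise have to be built by hand.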
What's next?
To promote ongoing adoption and accessibility of ML, the industry must collaborate as a community to establish a flourishing open source cloud ecosystem, beginning with interoperable tools.
Developers want user-friendly tools that get algorithms up and running without lengthy onboarding and position them to start reaping the benefits of AI. Easy access is a must; they don't want to keep relearning how to operate different proprietary tooling.
Various technical solutions can also address ML's challenges. One particularly promising avenue is GPU edge boxes, which enable ML to run effectively across diverse use cases and workloads, both on-premises and in the cloud. GPU instances themselves offer quick launch times, pooled bandwidth, and transparent pricing, giving companies a fast, cost-effective way to adopt ML without unforeseen expenses.
The potential of ML is immense. By equipping the developer community with the open source tools they require, the sky's the limit.