DATA SCIENCE PLATFORM
Data Science is a fast-evolving field. It applies scientific and mathematical algorithms to draw meaningful inferences from varied datasets. Machine Learning, one of its newer tools, is helping data scientists build more powerful, real-world solutions.
Data Science is not new, but with the software and infrastructure platforms available today it can address more complex problems than before.
Technology
MPCL provides the Data Science Platform from Iguazio. It is an end-to-end data science platform that lets you get started with Data Science without the hassle of DevOps. It packages key open source components in a containerised environment managed by Kubernetes, a container orchestration tool. Using Iguazio, a data engineer can cut the time needed to prepare a data store, while a data scientist can set up a development environment quickly. Finally, the platform automates deployment with Nuclio serverless functions and automates workflows with Kubeflow.
For developers
Iguazio runs not only in the cloud but also on on-prem CPU/GPU servers. The Data Science Platform makes a Data Scientist's life easier by eliminating DevOps work and freeing up time for research, coding and experimentation. A single click provides a Jupyter notebook for development. Iguazio also offers a Data Layer that lets a Data Engineer collect and enrich data from any source and ingest multi-model data in real time and at scale, including event-driven streaming, time series, NoSQL, SQL and files. Horovod is a distributed training framework used to spread a single training task across multiple GPUs. Writing Horovod code by hand is not simple. Iguazio addresses this pain point by providing Horovod as a service, which automates the distributed training task and lets the Data Scientist focus on improving the model and training it faster.
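To give a feel for what distributed training across several GPUs involves, here is a minimal sketch in plain Python. It is not Horovod code and does not use the Iguazio service. It only imitates the core idea of data-parallel training: each worker computes a gradient on its own shard of the data, and the gradients are averaged before every update. The function names, the toy model y = w * x and the worker count are all assumptions made for illustration.

```python
# Illustration only: plain-Python imitation of data-parallel training.
# Real Horovod averages gradients across GPUs; here "workers" are list slices.

def local_gradient(w, shard):
    """Gradient of the mean squared error for y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values):
    """Stand-in for the averaging step that combines all workers' gradients."""
    return sum(values) / len(values)


def train(data, num_workers=4, lr=0.01, steps=200):
    """Train w by averaging per-worker gradients at every step."""
    # Split the data so that every worker sees a different, equally sized shard.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        grads = [local_gradient(w, shard) for shard in shards]
        w -= lr * allreduce_mean(grads)
    return w


# Toy dataset generated from y = 3 * x (hypothetical values).
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train(data)
print(round(w, 4))
```

Because the shards here are the same size, the averaged gradient equals the gradient over the full dataset, so the result matches single-worker training while the per-step work is divided among the workers.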
Application Acceleration
Iguazio delivers Kubeflow as a service with a simple UI, which removes the need to use the command line for Kubernetes deployment. As mentioned above, the containerised environment accelerates the workflow by eliminating DevOps. Horovod as a service automates distributed training and so speeds up the training process: a job that sometimes takes days on a single GPU could be finished in an hour using multiple GPUs. Deployment is automated with Nuclio serverless functions, enabling a Data Scientist to deploy models and APIs from a Jupyter notebook or IDE to production in just a few clicks and to monitor model performance continuously.
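As a rough idea of what a model-serving function deployed through Nuclio can look like, the sketch below follows the Nuclio Python convention of a handler that receives a context and an event. The predict function, the "features" field in the request body and the stub objects used for the local call are assumptions made for illustration. They are not part of Iguazio's or Nuclio's API.

```python
import json
from types import SimpleNamespace


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


def handler(context, event):
    """Nuclio-style entry point: read the request body and return a prediction."""
    body = event.body
    # The body may arrive as raw bytes or text; decode it into a dict if so.
    if isinstance(body, (bytes, str)):
        body = json.loads(body)
    features = body["features"]
    context.logger.info("scoring %d features" % len(features))
    return json.dumps({"prediction": predict(features)})


class _StubLogger:
    """Minimal logger so the handler can be tried outside Nuclio."""

    def info(self, message):
        print(message)


# Local dry run with stub context and event objects (assumed shapes).
fake_context = SimpleNamespace(logger=_StubLogger())
fake_event = SimpleNamespace(body=json.dumps({"features": [1.0, 2.0, 3.0]}))
result = handler(fake_context, fake_event)
print(result)
```

Keeping the model call in its own function means the same handler can be tried in a notebook with stub objects before it is deployed.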