Wendelin: Out-of-Core PyData


The wendelin.core library and the Wendelin platform provide two practical solutions to scale up PyData libraries and scripts beyond the limits of RAM. Wendelin thus eliminates the need to redevelop and port PyData scripts to another language and platform as soon as data grows. PyData can now be used both for prototyping data analytics with Jupyter and for scalable deployment with Wendelin. Both wendelin.core and Wendelin are Free Software supported by Nexedi.

Last Update: 2016-06-14
Version: 001
Language: en


Wendelin technology transparently scales up PyData scripts created by data scientists to match the requirements of big data processing. A simple tutorial is available online: https://nexedi.com/wendelin-Core.Tutorial.2016

Scaling PyData beyond RAM

Wendelin.core is a library that solves a problem many PyData analysts face as soon as data grows: the RAM limit. Wendelin.core implements a distributed, shared, transactional virtual memory manager that combines the RAM and storage of a redundant array of inexpensive computers (RAIC) as if it were a single server with huge RAM and storage.

Most of the time, using wendelin.core requires no changes to the source code of PyData libraries. ndarrays managed by wendelin.core appear to PyData as standard ndarrays, so the use of wendelin.core is transparent.
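The idea of data that lives outside RAM yet behaves like an ordinary in-memory sequence can be illustrated with the Python standard library alone. The sketch below is only an analogy and does not use the wendelin.core API: it maps a file into virtual memory on a single machine, and the helper names write_floats and file_backed_sum are hypothetical.

```python
import array
import mmap
import os
import tempfile


def write_floats(path, values):
    """Store values as native float64 in a file (stand-in for data on disk)."""
    with open(path, "wb") as f:
        array.array("d", values).tofile(f)


def file_backed_sum(path):
    """Sum float64 values through a read-only memory mapping of the file.

    The operating system pages data in on demand, so the file content is
    never loaded into a Python list first.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Reinterpret the mapped bytes as float64 items without copying.
            with memoryview(mm) as raw, raw.cast("d") as floats:
                return sum(floats)


# Usage: the mapped file is consumed by the built-in sum() like any sequence.
path = os.path.join(tempfile.mkdtemp(), "samples.bin")
write_floats(path, range(1000))
total = file_backed_sum(path)
```

The calling code does not need to know that the values come from a file, which is the same kind of transparency the text describes for ndarrays, here limited to one machine and one file.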

Limitless Storage

Wendelin is an industrial big data platform based on the wendelin.core library and the NEO NoSQL database. It provides a ready-to-run production environment to deploy PyData scripts and analyse large quantities of data. The NEO NoSQL database can be extended at runtime by adding inexpensive servers with additional storage. Storage size thus becomes practically limitless.

The Wendelin platform relies on the fluentd standard for reliable and scalable data ingestion.

Parallel PyData

The Wendelin platform includes a parallel processing engine based on the "Actalk" model, a generalization of MapReduce that was actually created before MapReduce itself. Wendelin parallel processing can be plugged into PyData libraries such as joblib, or simply invoked directly from any script thanks to the activate() method, which distributes computation over all nodes of the cluster.
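The split, map and reduce structure that such engines generalize can be sketched on a single machine with the standard library. This is not Wendelin's activate() method nor joblib, whose signatures are not described here. The functions split and parallel_sum_of_squares are hypothetical names, and threads are used only to show the structure, since they do not speed up CPU-bound pure Python code.

```python
from concurrent.futures import ThreadPoolExecutor


def split(values, parts):
    """Cut a sequence into at most `parts` contiguous chunks."""
    size = max(1, -(-len(values) // parts))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum_of_squares(values, workers=4):
    """Map a partial computation over chunks, then reduce the results."""
    chunks = split(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map step: each worker computes the sum of squares of one chunk.
        partials = pool.map(lambda chunk: sum(x * x for x in chunk), chunks)
        # Reduce step: combine the partial results into one value.
        return sum(partials)


result = parallel_sum_of_squares(list(range(10)))
```

On a cluster, each chunk would be handled by a different node instead of a different thread, while the calling script keeps the same shape.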

Transactions for consistent analytics

Wendelin.core supports transactional processing. This means that processing data in place carries no risk in Wendelin: if an in-place operation fails due to a software or hardware error, all modifications are reverted and data remains consistent.

In a production system, transaction support also ensures that real data is either fully ingested or not ingested at all. Many corner cases resulting from the lack of transactions are thus eliminated.
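The all-or-nothing behaviour described above can be pictured with a small in-memory sketch. It does not show how Wendelin.core implements transactions: the context manager all_or_nothing and the function invert_in_place are hypothetical. The rollback relies on an explicit snapshot copy, which is only reasonable for small data.

```python
from contextlib import contextmanager


@contextmanager
def all_or_nothing(data):
    """Revert in-place changes to a mutable sequence if the block raises."""
    snapshot = list(data)  # explicit copy: acceptable only for small data
    try:
        yield data
    except Exception:
        data[:] = snapshot  # roll back every modification made so far
        raise


def invert_in_place(values):
    """Replace each value by its reciprocal; raises on a zero value."""
    for i in range(len(values)):
        values[i] = 1 / values[i]


data = [4.0, 2.0, 0.0, 1.0]
try:
    with all_or_nothing(data):
        invert_in_place(data)  # fails at the third item, after two updates
except ZeroDivisionError:
    pass
```

After the failure, data holds its original values even though the first two items had already been overwritten when the error occurred.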

Nexedi Support

Nexedi provides full commercial support for wendelin.core and the Wendelin platform. For example, in some rare cases, PyData code relies explicitly on copying data in RAM. Such code has to be modified, and the changes can then be contributed back to the PyData community, since explicit copies are generally a bad thing. Nexedi experts can help data scientists put their scripts into production by eliminating the programming patterns that prevent PyData code from scaling up.
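As an illustration of the copy-free pattern mentioned above, the following sketch updates a buffer chunk by chunk through views instead of copies. It uses only the standard library rather than NumPy or wendelin.core, and the function name double_in_place and its chunk_size parameter are hypothetical.

```python
import array


def double_in_place(values, chunk_size=1024):
    """Double each float64 item of a writable buffer, chunk by chunk."""
    with memoryview(values) as view:
        for start in range(0, len(view), chunk_size):
            # Slicing a memoryview yields another view on the same memory,
            # whereas slicing a list or an array would allocate a copy.
            with view[start:start + chunk_size] as chunk:
                for i in range(len(chunk)):
                    chunk[i] *= 2


samples = array.array("d", [1.0, 2.0, 3.0])
double_in_place(samples, chunk_size=2)
```

Because every chunk is a view, peak memory use stays independent of the total data size, which is the property that explicit copies break.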

Interested in Big Data Services? Get in Touch!

Need to store Big Data reliably? Looking for experienced data analysts' advice on data handling? Need to hit the ground running with a custom Big Data application? Nexedi is here to help! Successful implementation examples can be found on both the Wendelin and Nexedi websites. For further information and individual offers for your business case, please contact us through our website's contact section.

Key Facts

Initial Release: 2015
Current Version: 0.5
Operating System: GNU/Linux
Licence: GPL

Key Features

Out-of-Core
Parallel
Distributed Storage
Automated Infrastructure
Standardized Data Ingestion

Key Services

Expert Advice On Scalability
Expert Advice On Data Analysis
Big Data Application Development
24/7 Support

Key Industries

Accounting
Aerospace
Automotive
Banking
Energy
Government
Manufacturing
Tolling

Contact

Nexedi SA
270, BD Clémenceau
59700 Marcq-en-Baroeul
France

Phone: +33 629 02 44 25
Mail: info@nexedi.com
Web: www.nexedi.com