
XetHub raises $7.5M for its Git-based data collaboration platform • TechCrunch
Seattle-based XetHub, a startup that makes it easy for companies to use Git for data management, today announced that it has raised a $7.5 million seed funding round led by Madrona. The basic idea here is to allow developers to work with data the same way they work with code, including all of the collaboration features a tool like Git enables. The team describes XetHub as a “collaborative storage platform for data management.”
The company was co-founded by Yucheng Low (CEO), Ajit Banerjee and Rajat Arya, a team with years of experience working with large data platforms. Indeed, Low previously co-founded ML startup Turi, where Arya was the first employee. Apple acquired the company in 2016, allowing Low and Arya to work on various parts of Apple’s ML platform stack, with Arya leading Apple’s data platform team, for example. It was also at Apple that the two met Banerjee, who previously worked at Inktomi, Amazon and Facebook. He also previously founded two startups.
XetHub’s repository view is designed for navigating and visualizing data repositories while maintaining GitHub sensibilities. XetHub automatically summarizes common file formats (CSV) and supports custom visualizations.
During their time working on the data platform at Apple, the team realized that there was still a lot of room for improvement in the data management space.
“It really shouldn’t come as a surprise, but data is far more important than everything else. More important than the model, than anything else,” Low told me. “Managing where you store this data, how you collaborate on this data is really fundamental. However, what we see is that the way we manage data today really looks like how source code was done 30 years ago, which means version control or collaboration is done by copy-and-paste. Sometimes there’s a more elaborate version of it, but it’s still ultimately copy-and-paste if I want to make sure nobody else is touching what I’m doing.”
Just as developers have moved to tools like Git for collaborating on their source code, XetHub wants to allow them to use these same familiar primitives for working with data.
“The way we think about it is that for the first time, we actually enable developers to work on data in exactly the same way as code,” Low said. He noted that the team aimed to create a tool that doesn’t just mimic a Git-like experience but one that preserves the core Git user experience, including all of the integrations that developers are familiar with.

XetHub extends Git to support large files, offering efficient storage and transfer with data deduplication while maintaining full Git compatibility.
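For readers curious what data deduplication means in practice, here is a minimal sketch of the general idea: split files into chunks, address each chunk by its hash, and store identical chunks only once. This is not XetHub’s implementation, which the company did not detail here. The `ChunkStore` class, the fixed `CHUNK_SIZE`, and the sample data are all hypothetical, chosen only to keep the illustration small.

```python
import hashlib

# Hypothetical chunk size, tiny on purpose so the example is easy to follow.
CHUNK_SIZE = 4


class ChunkStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def put(self, data):
        """Split data into fixed-size chunks, store unseen ones, return the digests."""
        digests = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # setdefault only stores the chunk if this digest is new.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        return digests

    def get(self, digests):
        """Reassemble the original bytes from a list of chunk digests."""
        return b"".join(self.chunks[d] for d in digests)

    def stored_bytes(self):
        """Bytes physically held, after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = ChunkStore()
version_1 = b"AAAABBBBCCCC"
version_2 = b"AAAABBBBDDDD"  # same as version_1 except for the last chunk
recipe_1 = store.put(version_1)
recipe_2 = store.put(version_2)

logical = len(version_1) + len(version_2)
print(f"logical bytes: {logical}, stored bytes: {store.stored_bytes()}")
```

The two versions share their first two chunks, so the store holds four unique chunks rather than six. Real systems typically use much larger, often content-defined, chunks so that an insertion in the middle of a file does not shift every chunk boundary after it.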
Currently, the service can handle repositories with up to 1TB of data, with plans to grow this to 100TB soon. Few developers will want to clone a large repository like this, so one nifty feature here is that developers can also mount these repositories and make them behave like a local file system, no matter whether that’s on their laptop or a large GPU cluster. It’s also worth noting that the tool is agnostic to file formats.
From a marketing perspective, the team is focusing its efforts on AI/ML teams, but users can obviously use XetHub for managing any kind of data.
XetHub is now publicly available with a free community edition that you can use to manage up to 20GB of deduplicated storage. Low tells me the company is already talking to some enterprise customers, but the team isn’t quite ready to name names yet.
“Yucheng and the exceptional XetHub team have been innovating with machine learning for well over a decade, and then applying their skills at the most iconic consumer technology company, Apple. XetHub enables developers to work with large datasets, in collaboration with others, to build intelligent and generative applications,” said Matt McIlwain, Managing Director, Madrona. “Building and deploying these applications is constrained by legacy infrastructure and complex data workflows, and XetHub addresses these pain points from the developer point of view.”