Creating Public Goods for the Ecosystem
We believe in enabling the open-source community and creating public goods along the way. The path to this goal is not as straightforward as it may seem: we will have to free the community from its reliance on non-transparent data. Here is our initial plan for doing so:
Open-Source Datasets
Imagine if every developer, researcher, or enthusiast had access to high-quality, transparent datasets to fuel their AI projects. Today, much of the data used to train AI models is locked away behind corporate walls or obscured by complex licenses. This lack of accessibility hinders innovation and keeps valuable resources out of the hands of those who could use them to make meaningful contributions.
We're changing that. When institutions use Session's network for web scraping or data collection, the data they gather doesn't just sit in private silos. Instead, it's automatically shared to help build comprehensive public datasets. Each piece of data, along with its metadata, is recorded on a blockchain. This means anyone can trace where the data came from, ensuring its authenticity and integrity.
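As a minimal sketch of the idea (not Session's actual implementation; the function and field names here are hypothetical), each collected record can be hashed and its digest appended to an append-only ledger together with its collection metadata:

```python
import hashlib
import json
import time

def provenance_entry(record: bytes, source_url: str, collector: str) -> dict:
    """Build a ledger entry: the record's digest plus collection metadata."""
    return {
        "sha256": hashlib.sha256(record).hexdigest(),
        "source_url": source_url,
        "collector": collector,
        "collected_at": int(time.time()),
    }

# A plain Python list stands in for the on-chain ledger in this sketch.
ledger: list[dict] = []
ledger.append(provenance_entry(
    b"<html>example page</html>", "https://example.com", "node-42"))

print(json.dumps(ledger[0], indent=2))
```

Because only the digest and metadata are recorded, anyone holding a copy of the record can later recompute the hash and compare it against the ledger to confirm the data's origin and integrity.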
By creating these open-source datasets, we're leveling the playing field. Students, independent developers, startups, and researchers worldwide will have free access to the data they need to train and improve AI models. This democratization of data empowers a diverse range of voices and ideas, fostering innovation and transparency that benefits everyone.
Fully Transparent & Verifiable LLM
Large Language Models have the potential to revolutionize how we interact with technology, but they often come with a significant drawback: a lack of transparency. Users typically have no insight into how these models were trained or what data influences their responses. This opacity can lead to issues like unintentional bias, misinformation, and a general mistrust of AI systems.
We're taking a different approach. By training an LLM on our open-source datasets, with every data point's origin verifiable on-chain, we're building a model that's transparent from the ground up. Users and developers alike can see exactly what data the model was trained on, fostering trust and enabling more informed use of the technology.
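The verification step above can be sketched as follows (an illustrative toy, not the real pipeline): before training, recompute each record's digest and compare it against the provenance entries recorded for the corpus, flagging anything that no longer matches.

```python
import hashlib

def verify_corpus(records: list[bytes], ledger: list[dict]) -> list[int]:
    """Return the indices of records whose digest no longer matches
    the provenance entry recorded for them."""
    return [
        i for i, (record, entry) in enumerate(zip(records, ledger))
        if hashlib.sha256(record).hexdigest() != entry["sha256"]
    ]

# Toy corpus and its recorded digests (standing in for on-chain entries).
records = [b"doc one", b"doc two"]
ledger = [{"sha256": hashlib.sha256(r).hexdigest()} for r in records]

print(verify_corpus(records, ledger))                    # []
print(verify_corpus([b"tampered", b"doc two"], ledger))  # [0]
```

An empty result means the training corpus is exactly the data whose origin is recorded on-chain; any non-empty result pinpoints records that were altered after collection.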
But transparency isn't our only goal; privacy matters too. We leverage Nillion's advanced cryptographic technology to perform computations on encrypted data without decrypting it. This means the model can be trained and operated without exposing sensitive data.
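To give a feel for computing on data no single party can read, here is a toy additive secret-sharing sketch. This is not Nillion's API or protocol, just a minimal illustration of the underlying principle: a value is split into shares that individually reveal nothing, yet arithmetic on the shares yields shares of the result.

```python
import secrets

P = 2**61 - 1  # prime modulus for the toy field

def share(value: int, n_parties: int = 3) -> list[int]:
    """Split a value into n additive shares; any n-1 shares reveal nothing."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % P

# Each party adds its two shares locally; the sum is computed without
# any party ever seeing the plaintext inputs.
a_shares = share(20)
b_shares = share(22)
sum_shares = [(x + y) % P for x, y in zip(a_shares, b_shares)]
print(reconstruct(sum_shares))  # 42
```

Real systems layer much more on top (multiplication protocols, malicious-security checks, networked parties), but the design choice is the same: computation proceeds on shares, and plaintext appears only when the parties jointly reconstruct the output.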
The implications are profound. We're creating an AI that not only provides powerful capabilities but does so in a way that's accountable and respectful of users' privacy. It's an approach that could redefine how AI systems are built and deployed, setting new standards for ethics and transparency in the industry.
Why This Matters
By focusing on transparency and community collaboration, we're addressing some of the most pressing challenges in AI development today. Lack of access to quality data and opaque AI models hinder progress and erode trust. Tackling these issues head-on delivers:
Trustworthy AI Applications: Users can have confidence in AI services knowing that they are built on transparent and verifiable foundations.
Regulatory Compliance: Transparent practices align with emerging regulations that demand accountability and explainability in AI systems.
Ethical AI Development: By prioritizing transparency and verifiability, we contribute to the development of AI technologies that respect user rights and societal values.