We are a Netherlands-based start-up company that makes big data reliable and accountable, delivering trustworthy analytics and AI solutions. We validate multiple data sources and are able to merge private and proprietary data with open data. We bring novel insight into policy and business problems, as well as scientific research. Our work addresses the potentially negative effects of black-box proprietary algorithms. Our diverse team is particularly experienced in music, the creative industries, and digital humanities, where data is scattered across many small organizations.
Our flagship demo projects are the Listen Local ethical music recommendation system and the Demo Music Observatory, the data integration and knowledge sharing platform on which it is based. We have validated our product/market fit in the prestigious Yes!Delft AI+Blockchain Lab. We are members of the Dutch AI Coalition and participate in the work of the European AI Alliance.
See our services: data curation, open data access, survey harmonization, reproducible research and validated trustworthy AI applications.
Download our introduction.
Follow news about us or the more comprehensive Data & Lyrics blog.
Contact us.
Company details: Registration number: 80205275 (KvK Chamber of Commerce); founding document: 000046572589 (vestigingnummer); VAT number: NL 861587893B01.
From open data and open-source statistical software to data-as-a-service.
For more posts, visit our blog, Data & Lyrics.
Our peer-reviewed, open source statistical software packages
We believe that transparency is the key to the highest data quality. We use only open-source software, and we open up the critical elements of our software for peer review.
Because we use open-source software, there is no vendor lock-in.
Our data products go through many automated (unit) tests, which replace countless hours of error-prone manual validation.
The critical elements of our code go through external validation and peer review by computational statisticians and data scientists.
The goal of retroharmonize is to facilitate retrospective (ex-post) harmonization of data, particularly survey data, in a reproducible manner.
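To make the idea of retrospective harmonization concrete, here is a minimal Python sketch. It is not the retroharmonize API (retroharmonize is an R package); the survey waves, their codes, and the function names `harmonize` and `harmonize_wave` are all hypothetical and chosen only for illustration. It shows two waves that code the same yes/no question differently being mapped to one common coding, with missing answers kept apart from valid ones.

```python
# Hypothetical value labels: two survey waves code the same question differently.
WAVE_2019 = {1: "yes", 2: "no", 9: "declined"}
WAVE_2021 = {1: "Yes", 0: "No", 98: "Don't know", 99: "Refused"}

# Harmonized target coding (an assumption made for this sketch).
HARMONIZED = {"yes": 1, "no": 0}
# Missing-type labels are mapped to a common reason instead of a numeric value.
MISSING = {
    "declined": "declined",
    "refused": "declined",
    "don't know": "do_not_know",
}


def harmonize(code, labels):
    """Map a wave-specific code to a (harmonized value, missing reason) pair."""
    label = labels[code].strip().lower()
    if label in HARMONIZED:
        return HARMONIZED[label], None
    if label in MISSING:
        return None, MISSING[label]
    # Fail loudly so that an unmapped label is never silently merged.
    raise ValueError(f"unmapped label: {label!r}")


def harmonize_wave(codes, labels):
    """Harmonize every response of one wave using that wave's labels."""
    return [harmonize(code, labels) for code in codes]


merged = harmonize_wave([1, 2, 9], WAVE_2019) + harmonize_wave([0, 1, 99], WAVE_2021)
print(merged)
```

Because the mapping tables are plain data and unmapped labels raise an error, the same script can be rerun on new waves and gives the same result each time, which is the sense of "reproducible" intended here.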
Our report highlights some important lessons. First, we show that in the era of global music sales platforms it is impossible to understand the economics of music streaming without international data harmonization and advanced surveying and sampling. Paradoxically, without careful adjustments for accruals, market shares in jurisdictions, and disaggregation of price and volume changes, the British industry cannot analyze its own economics because of its high level of integration into the global music economy. Furthermore, the replacement of former public performance, mechanical licensing, and private copying remunerations (which have been available to British rightsholders in their European markets for decades) with less valuable streaming licenses has left many rightsholders poorer. Adjusting the distribution system without modifying the definition of equitable remuneration rights or the pro-rata distribution scheme of streaming platforms opens up many conflicts while solving too few fundamental problems. Therefore, we suggest participation in international data harmonization and policy coordination to help regain the historical value of music.
Why are the total market shares of Slovak music relatively low in both the domestic and the foreign markets? How can we measure the market share of Slovak music in these markets? We offer some answers and solutions based on empirical research, the creation of a database, and an AI application.
Our paper argues that fair competition in music streaming is restricted by the nature of the remuneration arrangements between creators and the streaming platforms, the role of playlists, and the strong negotiating power of the major labels. It concludes that urgent consideration should be given to a user-centric payment system, as well as greater transparency of the factors underpinning playlist creation and of negotiated agreements.
While the US has already taken steps to provide an integrated data space for music as of 1 January 2021, the EU is facing major obstacles not only in the field of music but also in other creative industry sectors. Weighing costs and benefits, there can be little doubt that new data improvement initiatives and sufficient investment in a better copyright data infrastructure should play a central role in EU copyright policy. A trade-off between data harmonisation and interoperability on the one hand, and transparency and accountability of content recommender systems on the other, could pave the way for successful new initiatives.
The topic of the paper is Library Genesis (LG), the biggest piratical scholarly library on the internet, which provides copyright-infringing access to more than 2.5 million scientific monographs, edited volumes, and textbooks. The paper uses advanced statistical methods to explain why researchers around the globe use copyright-infringing knowledge resources. The analysis is based on a huge usage dataset from LG, as well as data from the World Bank, Eurostat, and Eurobarometer, to identify the role of macroeconomic factors, such as R&D and higher education spending, GDP, and researcher density, in scholarly copyright-infringing activities.
Publications featuring our datasets and technology