News archive

Amlet selected by IBM Hyper Protect Accelerator

posted Sep 16, 2020, 6:49 AM by Enrico Fagnoni   [ updated Sep 16, 2020, 7:17 AM ]

Today (15/09/2020), during IBM Z Day, it was officially announced that our Amlet project has been selected by the IBM Hyper Protect Accelerator for 2021, ranked #2 among the 15 most promising projects in the world and #1 in Europe. Amlo was admitted to the Cohort 2 program of the San Francisco accelerator.

Amlet is part of the Mopso product suite devoted to AML (anti-money laundering). It automates the Customer Due Diligence process through innovative, privacy-preserving cryptographic algorithms.

Anassimene, the third generation of SDaaS is ready

posted Jun 6, 2020, 5:52 AM by Enrico Fagnoni

2020.06.05: today LinkedData.Center labs announce the pre-release of the SDaaS platform, two times faster and smarter than the previous version.

Anassimene is used in the Mopso Brain and Mopso Net products.

Mopso selected by MPS

posted Apr 4, 2020, 6:18 AM by Enrico Fagnoni

Mopso has been selected for the OfficinaMPS open banking initiative, the permanent laboratory of Banca Monte dei Paschi di Siena dedicated to innovation.

The MOPSO project started

posted Apr 4, 2020, 6:13 AM by Enrico Fagnoni

Watch out for the robot's error

posted Jun 7, 2019, 7:31 AM by Enrico Fagnoni

The result obtained from an algorithm based on neural networks cannot be explained. Moreover, it always has a statistical error, which is often also quantifiable.

Lack of proof is the fundamental difference between neural networks and other A.I. tools such as inference systems based on the open world assumption (i.e. rule systems that tolerate missing information). Unlike neural networks, A.I. systems of this kind are always able to justify their choices. The Semantic Web is the best-known example.

The prevailing trend reduces the whole of A.I. to machine learning alone (in particular to neural networks), but there are many ways of doing things. The technique of making a machine learn by example is undoubtedly the one that requires the least cognitive effort from human beings, and perhaps that is why it generates so many expectations.

For things to work, we always need a logical-deductive substrate which, in a more or less distant future, might also be derived by a machine but which, for now, MUST always be modeled "by hand" and must always be an integral part of every automatic system that makes decisions.

In other words, you always need to place the models generated through machine learning in a formal logical context that evaluates rules defined by humans and based on socially shared conceptualizations. To build this logical model, you need to think a lot, discuss a lot and work hard to formalize it; maybe that's why we tend to pretend it's not needed.

In exchange for such significant work, you can always know what you're talking about, what you're doing and why you're doing it. 
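As a minimal sketch of this idea, the snippet below places a stand-in for a learned model inside explicit, human-written rules, so that every decision carries the reasons behind it. The function `model_score`, the two rules and the thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Decision:
    approved: bool
    reasons: List[str] = field(default_factory=list)


def model_score(applicant: dict) -> float:
    """Stand-in for an opaque ML model (hypothetical): a risk score in [0, 1]."""
    return min(1.0, applicant["debt"] / max(applicant["income"], 1))


# Human-defined rules: each returns a reason when it objects, otherwise None.
Rule = Callable[[dict, float], Optional[str]]


def rule_minor(applicant: dict, score: float) -> Optional[str]:
    return "applicant is under 18" if applicant["age"] < 18 else None


def rule_high_risk(applicant: dict, score: float) -> Optional[str]:
    return f"model risk score {score:.2f} exceeds 0.50" if score > 0.5 else None


RULES: List[Rule] = [rule_minor, rule_high_risk]


def decide(applicant: dict) -> Decision:
    """The model only supplies a number; the rules make and explain the decision."""
    score = model_score(applicant)
    reasons = []
    for rule in RULES:
        reason = rule(applicant, score)
        if reason is not None:
            reasons.append(reason)
    return Decision(approved=not reasons, reasons=reasons or ["no rule objected"])


print(decide({"age": 30, "income": 1000, "debt": 800}))
print(decide({"age": 30, "income": 1000, "debt": 100}))
```

The point of the design is that the statistical component never decides alone: whatever it outputs is evaluated by rules that a person wrote and can read.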

Machine Learning: oh yeah?

posted May 17, 2019, 1:51 AM by Enrico Fagnoni   [ updated May 17, 2019, 1:54 AM ]

Recently, some people have started to talk about training AI models while preserving the privacy of the training data (e.g. "A Demonstration of Sterling: A Privacy-Preserving Data Marketplace", Nick Hynes, David Dao, David Yan, Raymond Cheng, Dawn Song, VLDB Demo 2018).

In my opinion, creating an ML model makes sense only on public data sets or, in any case, on data that can be verified by those who will use the resulting model. Otherwise, the results are indistinguishable from random values that have to be revalidated experimentally at each update of the model.

Using a "black box" model is a total act of faith. We fall into the "reputational" scheme where the crowd decides what is right and what is wrong without having the elements to do so.  At the very least, those who produce an opaque model should be responsible for the errors produced by the model they created.

The risk of switching from fake news to fake data and/or fake models is very high. 

We need the "Oh yeah?" button

In this regard, I invite you to read this passage from a 1997 article by Sir Tim Berners-Lee, which I copy here for convenience:

Deeds are ways we tell the computer, the system, other people, the Web, to trust something. How does the Web tell us?

It can happen in lots of ways but again it needs a clear user interface. It's no good for one's computer to be aware of the lack of security about a document if the user can ignore it. But then, most of the time as user I want to concentrate on the content not on the metadata: so I don't want the security to be too intrusive. The machine can check back the reasons why it might trust a document automatically or when asked. Here is just one way I could accept it.

At the toolbar (menu, whatever) associated with a document there is a button marked "Oh, yeah?". You press it when you lose that feeling of trust. It says to the Web, "so how do I know I can trust this information?". The software then goes directly or indirectly back to metainformation about the document, which suggests a number of reasons. These are like incomplete logical proofs. One might say,

"This offer for sale is signed with a key mentioned in a list of keys (linked) which asserts that the Internet Consumers Association endorses it as reputable for consumer trade in 1997 for transactions up to $5000. The list is signed with key (value) which you may trust as an authority for such statements."

Your computer fetches the list and verifies the signature because it has found in a personal statement that you trust the given key as being valid for such statements. That is, you have said, or whoever you trusted to set up your profile said,

"Key (value) is good for verification of any statement of the form `the Internet Consumers Association endorses page(p) as reputable for consumer trade in 1997 for transactions up to $5000.'"

and you have also said that "I trust for purchases up to $3000 any page(p) for which `the Internet Consumers Association endorses page(p) as reputable for consumer trade in 1997 for transactions up to $5000.'"

The result of pressing on the "Oh, yeah?" button is either a list of assumptions on which the trust is based, or of course an error message indicating either that a signature has failed, or that the system couldn't find a path of trust from you to the page.

Notice that to do this, we do not need a system which can derive a proof or disproof of any arbitrary logical assertion. The client will be helped by the server, in that the server will have an incentive to send a suggested proof or set of possible proof paths. Therefore it won't be necessary for the client to search all over the web for the path.

The "Oh, yeah?" button is in fact the relatively easy bit of human interface. Allowing the user to make statements above and understand them is much more difficult. About as difficult as programming a VCR clock: too difficult. So I imagine that the subset of the logic language which is offered to most users will be simple: certainly not Turing complete!

The hype around ML and AI must not let us forget that some problems, and their solutions, are even older than the internet.

Re-thinking applications in the edge computing era

posted Mar 17, 2019, 4:51 AM by Enrico Fagnoni   [ updated Mar 18, 2019, 2:18 AM ]

The EU GDPR was a cornerstone for the Information Society. More or less, it states that the ownership of data is an inalienable right of the data producer; before GDPR, data ownership was something marketable. Now, to use someone else's data, you always need to obtain permission, which can be revoked at any time. Besides this, IoT requires more and more local data processing, driving the edge computing paradigm.

Recent specifications like SOLID and IPFS promise radical but practical solutions to move toward a real data distribution paradigm, trying to restore the original objective of the web:  knowledge sharing. 

This view, where each person or machine has full control of its own data, contrasts with the centralized application data architecture used by the majority of applications.
Many signs tell us that this new vision is gaining consensus in both the political and the social world; but today, even when applications claim to be distributed (e.g. Wikipedia), they still adopt, as a matter of fact, a centralized data management architecture.

According to Sir Tim Berners-Lee, "The future is still so much bigger than the past". To be ready, we need to rethink data architectures, allowing applications to use information produced and managed by someone else, people or machines, outside our control.

Eric Brewer's theorem (also known as the CAP theorem) states that it is impossible for a distributed data store to simultaneously provide more than two of the following three guarantees:
  • Consistency: Every read receives the most recent write or an error
  • Availability: Every request receives a (non-error) response – without the guarantee that it contains the most recent write
  • Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
CAP is frequently misunderstood as if one has to choose to abandon one of the three guarantees at all times. In fact, the choice is really between consistency and availability only when a network partition or failure happens; at all other times, no trade-off has to be made. 

But in a truly distributed data model, where datasets are not under your control, network failure is ALWAYS an option, so you always have to choose.
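The choice can be made concrete with a toy replica of a value that is owned by someone else. This is only a sketch: the class `Replica`, its `mode` flag and the `partitioned` attribute are hypothetical names, and the network is simulated by a boolean.

```python
class Replica:
    """A node holding a local copy of a value whose owner is elsewhere."""

    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.local_value = None
        self.partitioned = False  # True when the owner cannot be reached

    def sync(self, value):
        """Called while the owner is reachable and sends its latest value."""
        self.local_value = value

    def read(self):
        if not self.partitioned:
            return self.local_value
        if self.mode == "CP":
            # Prefer consistency: an error rather than a possibly stale answer.
            raise ConnectionError("owner unreachable, refusing a possibly stale read")
        # Prefer availability: answer with the last known value, possibly stale.
        return self.local_value


cp, ap = Replica("CP"), Replica("AP")
for replica in (cp, ap):
    replica.sync("v1")
    replica.partitioned = True  # the owner may now write "v2"; we never hear of it

print("AP replica answers:", ap.read())
try:
    cp.read()
except ConnectionError as err:
    print("CP replica fails:", err)
```

When the data lives outside your control, the `partitioned` branch is not an exceptional case but part of normal operation, which is why the choice cannot be postponed.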

Dynamic caching is probably the only practical solution to face the dataset distribution problem, but as soon as you replicate data, a tradeoff between consistency and latency arises.

In 2010, Daniel J. Abadi of Yale University observed that in case of a network partition (P) one has to choose between availability (A) and consistency (C), else (E), even when the system is running normally in the absence of network errors, one has to choose between latency (L) and consistency (C). This is known as the PACELC theorem.
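A small cache with a time-to-live (TTL) illustrates the latency/consistency trade-off even without any network failure. The sketch below is an assumption-laden toy: `CachedDataset`, `fetch` and the explicit `now` argument are hypothetical, and the remote dataset is just a dictionary.

```python
class CachedDataset:
    """Local cache in front of a remote dataset (both simulated here)."""

    def __init__(self, fetch, ttl: float):
        self.fetch = fetch        # callable performing the slow remote read
        self.ttl = ttl            # seconds a cached copy is considered fresh
        self.value = None
        self.fetched_at = None
        self.remote_reads = 0     # how many times we paid the network latency

    def read(self, now: float):
        expired = self.fetched_at is None or now - self.fetched_at >= self.ttl
        if expired:
            self.value = self.fetch()  # slow path: consistent but high latency
            self.fetched_at = now
            self.remote_reads += 1
        return self.value              # fast path: low latency, may be stale


remote = {"version": 0}
strict = CachedDataset(lambda: remote["version"], ttl=0)    # always refetch
relaxed = CachedDataset(lambda: remote["version"], ttl=10)  # reuse for 10 s

for t in range(5):
    remote["version"] = t              # the owner updates the data every second
    print(t, strict.read(now=t), relaxed.read(now=t))

print("remote reads:", strict.remote_reads, "vs", relaxed.remote_reads)
```

With `ttl=0` every read is up to date but goes to the network; with a longer TTL reads are fast but may return an old version. Choosing the TTL is choosing a point between L and C.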

What does all this mean? You must start rethinking applications, abandoning the deterministic illusion that functions return the same outputs when you provide the same inputs.
In fact, the determinism on which much of today's information technology is based should be questioned. We have to start thinking about everything in terms of probability.

That's already happening with search engines (you do not get the same results for the same query) and with social networks (you do not see the same list of messages). It is not a feature: it is due to technical constraints, but Facebook, Google, and many other companies cleverly turned this problem into an opportunity, prioritizing ads, for instance.

If the edge computing paradigm gains momentum, all applications, including corporate ones, will have to address similar issues. For instance, the customer/supplier registry could (or should) be distributed.

Technologies and solutions such as IPFS, Linked Data, and RDF Graph Databases provide practical ways of caching and querying distributed datasets, helping to solve inconsistencies and performance issues. But they cannot be considered a drop-in replacement for older technology: they are tools for designing a new generation of applications able to survive in a network of distributed datasets.

Introducing the Financial Report Vocabulary

posted Feb 19, 2019, 7:33 AM by Enrico Fagnoni   [ updated Feb 19, 2019, 7:33 AM ]

The Financial Report Vocabulary (FR) is an OWL vocabulary to describe a generic financial report.

The FR vocabulary can be used to capture different perspectives of report data like historical trends, cross-department, and component breakdown.

FR extends the W3C RDF Data Cube Vocabulary and it is inspired by the Financial Report Semantics and Dynamics Theory.

New KEES specifications

posted Feb 19, 2019, 7:19 AM by Enrico Fagnoni   [ updated Feb 19, 2019, 7:27 AM ]

In order for computers to work for us, they must understand data: not just the grammar and the syntax, but the real meaning of things.

KEES (Knowledge Exchange Engine Service) proposes some specifications to describe domain knowledge in order to make it tradeable and shareable.

KEES allows you to formalize and license:

  • how to collect the right data,
  • how much you can trust your data,
  • what new information you can deduce from the collected data,
  • how to answer specific questions using data.

A.I. and humans can use this know-how to reuse and enrich existing knowledge. KEES is a Semantic Web Application.

KEES Overview

Released µSilex

posted Oct 1, 2018, 12:32 PM by Enrico Fagnoni   [ updated Feb 19, 2019, 7:12 AM ]

µSilex (aka micro Silex) is a micro framework inspired by Pimple and PSR standards. All in fewer than 100 lines of code!

µSilex is an attempt to build a standard middleware framework for developing microservices and API endpoints that require maximum performance with a minimal memory footprint.

Middleware is now a very popular topic in the developer community. The idea behind it is “wrapping” your application logic with additional request processing logic, and then chaining as many of those wrappers as you like. So when your server receives a request, it is first processed by your middlewares, and after you generate a response, the response is also processed by the same set.
It may sound complicated, but in fact, it’s very simple if you look at some examples of what a middleware could be:

  • Firewall – check if requests are allowed from a particular IP
  • JSON Formatter – parse JSON post data into parameters for your controller, then turn your response into JSON before sending it back
  • Smart proxies – forward a request to other servers, filtering and enriching the message payload.
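The wrapping and chaining described above can be sketched in a few lines. This is a generic illustration written in Python; it is not µSilex code and does not reflect its API. The names `firewall`, `json_formatter`, `chain` and `controller`, as well as the dict-based request and response, are assumptions made for the example.

```python
import json
from typing import Callable

Handler = Callable[[dict], dict]  # request dict in, response dict out


def firewall(allowed_ips: set) -> Callable[[Handler], Handler]:
    """Middleware factory: reject requests coming from unknown IPs."""
    def middleware(next_handler: Handler) -> Handler:
        def handler(request: dict) -> dict:
            if request.get("ip") not in allowed_ips:
                return {"status": 403, "body": "forbidden"}
            return next_handler(request)
        return handler
    return middleware


def json_formatter(next_handler: Handler) -> Handler:
    """Parse the JSON body on the way in, serialize the response on the way out."""
    def handler(request: dict) -> dict:
        request = {**request, "params": json.loads(request.get("body") or "{}")}
        response = next_handler(request)
        return {**response, "body": json.dumps(response["body"])}
    return handler


def chain(app: Handler, *middlewares) -> Handler:
    """Wrap the application; the first middleware listed is the outermost one."""
    for middleware in reversed(middlewares):
        app = middleware(app)
    return app


def controller(request: dict) -> dict:
    """The application logic itself, unaware of the wrappers around it."""
    return {"status": 200, "body": {"hello": request["params"].get("name", "world")}}


app = chain(controller, firewall({"10.0.0.1"}), json_formatter)

print(app({"ip": "10.0.0.1", "body": '{"name": "Ada"}'}))
print(app({"ip": "192.0.2.7", "body": "{}"}))
```

Each wrapper sees the request before the controller and the response after it, so new behaviour can be added or removed by editing the `chain(...)` call without touching the controller.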
