Projects

The open-source projects I have worked on. Maybe you can use them or contribute to them.

SmartReader

A .NET Standard library to extract the main content of a web page

SmartReader is designed to remove the clutter from a web page (ads, sidebars, and so on) and give you just the content. The core algorithm is a port of Mozilla's Readability library, which is stable and used in production inside Firefox. By relying on a library maintained by a competent organization like Mozilla, we can piggyback on their hard and well-tested work.

SmartReader also improves on the original library by extracting more and better metadata:

the site name
the author and publication date
the language
the excerpt of the article
the featured image
a list of the images found (it can optionally download them and store them as data URIs)
an estimate of the time needed to read the article
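
To give an idea of what a reading-time estimate involves, here is a minimal sketch in Python. It is not SmartReader's actual computation, which may well differ (for example, by taking the article's language into account). The function name `estimate_reading_time` and the default of 200 words per minute are assumptions made purely for illustration.

```python
import math
import re


def estimate_reading_time(text: str, words_per_minute: int = 200) -> int:
    """Estimate the reading time of `text` in whole minutes.

    Hypothetical helper: the 200 words-per-minute default is an assumed
    average reading speed, not a value taken from SmartReader.
    """
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")

    # Count runs of word characters as words.
    words = re.findall(r"\w+", text)
    if not words:
        return 0

    # Round up, so any non-empty text takes at least one minute.
    return max(1, math.ceil(len(words) / words_per_minute))


if __name__ == "__main__":
    sample = "word " * 450
    print(estimate_reading_time(sample), "minute(s)")
```

A fixed reading speed is the simplest possible choice. A more careful estimate would probably adjust the speed per language, since languages differ in how much text a reader gets through per minute.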
