Bookmarks are so limited. Unless you can remember the page title or the URL, you may never again find that oh-so-important page, the one you know you saw once with the one piece of information you need to finish your article or end a debate. To save myself the tedious task of trawling through my web history, I created Newsicles, a database of bookmarked news articles that strips them of advertising and other cruft (using this Python version of readability) and provides full-text search of their content. Then I added tagging to the articles to group them around subjects and story ideas. When I found even tagging too tedious, I added a natural language processor that guesses at the main subject based on frequency and identifies people, places and organizations within the text. Eventually, I want to create a database of those newsworthy entities -- the fundamental particles of news -- because that is how I want to read the news: as clusters of information around different entities, rather than wading through a series of reports on whatever ephemera a journalist classified as an "event" on any particular day.
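
To give a flavor of the frequency-based guessing described above, here is a minimal sketch using only the Python standard library. It is not the Newsicles code: the function name `guess_subjects`, the tiny stopword list and the length cutoff are all assumptions made for illustration, and a real pipeline would run on the cleaned article text and use a far more complete stopword list.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); a real one would be much longer.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "on", "for",
    "is", "was", "that", "with", "as", "by", "at", "it",
}


def guess_subjects(text, top_n=3):
    """Return the top_n most frequent non-stopword terms in text.

    Hypothetical helper: it only counts lowercase word tokens and
    knows nothing about phrases, stemming or named entities.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(
        w for w in words if w not in STOPWORDS and len(w) > 2
    )
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = (
        "The senate passed the budget. The budget cuts taxes, "
        "and the senate budget vote was close."
    )
    print(guess_subjects(sample, top_n=2))
```

Raw counts like these tend to favor long articles' filler words, which is why the stopword filter matters; identifying people, places and organizations needs a proper named-entity recognizer on top of this.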

The project is based on Django and written almost entirely in Python (except for the Firefox plugin). Although its app-like nature makes it prime fodder for a JavaScript web framework, I've bent over backwards to build the whole thing with the Django toolkit.

The code isn't posted right now. I deleted my GitLab account recently and I haven't had time to start a new one. Stay tuned, or get in touch if you would like to know more about this.


For 15 years, Conrad lived in Mexico, where he worked as a freelance radio and print journalist. His work took him into a minefield, a gunfight, a dugout canoe and the homes of many fascinating, brave and generous people. He is also an avid teacher and has led classes in radio, robotics, soccer, physics and anthropology in a variety of places, including office towers, lecture halls, fields and palm-thatched huts.