My quick thoughts, back stage, and rants as I try to Teach kids about the Web while learning how to help others build a better Web.

Greg McVerry

@edheil Anyone can have a free micro.blog account and syndicate with RSS or JSON from anywhere, but if you want a micro.blog page (which is also its instance on the Fediverse and you can follow people from Mastodon) you need to pay. Worth it for the positive community.

Greg McVerry

Testing my @withknown install to see if my edits to the note templates make posts syndicate to my fediverse instance: https://github.com/jgmac1106/Known/blob/master/IdnoPlugins/Status/templates/default/entity/Status.tp...

Greg McVerry

@kenbauer this is supposed to be the fediverse version of my site: quickthoughts.jgregorymcverry.com@quickthoughts.jgregorymcverry.com but something is wrong and nothing is getting sent out. We think it has something to do with my Apache setup, but we are not sure.

If you could try https://fed.brid.gy, it would be nice to test against other @Reclaimhosting sites.

Greg McVerry

@bytebot Couldn't agree more! The web is my fediverse. A lot of us have been using IndieAuth, which allows us to log in with our domains and then verify with a third party like Twitter or GitHub. Made this img to explain how it works: https://indieweb.org/graphics#Illustrations_and_Sketch_Notes

Greg McVerry

You could have directories, and each ring could have a Code of Conduct. It would be like federation but without the complication of the fediverse. Just a bunch of websites chilling together.

Greg McVerry

Scoping Out Basics of #IndieWeb Search

4 min read

Over the weekend I met with the CEO of BLUR Search Technologies. Jaime is also my brother-in-law, and sponsored IndieWebCamp NYC in 2018. We mainly gathered for Thanksgiving, the second Thanksgiving, and finally leftovers.

As we all played clean the fridge, we snuck away to scope out a possible search engine for the IndieWeb community. BLUR Search Technologies will donate time and technology, but we will need some help implementing some building blocks: IndieAuth, the Post Type Discovery algorithm, etc.

We will also check out indiemap.org and see how much of it we can use. I think it will be a ton, plus we have data already to play with.

Opt-in with IndieAuth

Yes, many of us publish openly, even with liberal licenses that allow for remixing and forking, but this does not mean we want the data scraped, parsed, and sorted. The right thing and what you have the right to do are not always the same.

Thus the first feature we would need is an opt-in service using the IndieAuth protocol, meaning the only website data the search engine would collect is the data you authorized.
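To make the opt-in idea concrete, here is a minimal sketch of the gate a crawler could check before touching a page. It assumes the IndieAuth sign-in has already happened elsewhere and only handed us the verified domain; the names `opted_in`, `opt_in`, `opt_out`, and `may_index` are made up for illustration and are not part of IndieAuth.

```python
from urllib.parse import urlparse

# Hypothetical in-memory registry of domains whose owners opted in.
# A real service would persist this somewhere.
opted_in = set()


def normalize(url):
    """Reduce a URL (or bare domain) to its lowercase host."""
    parsed = urlparse(url if "//" in url else "https://" + url)
    return parsed.netloc.lower()


def opt_in(me_url):
    """Record a domain after its owner has signed in with IndieAuth."""
    opted_in.add(normalize(me_url))


def opt_out(me_url):
    """Remove a domain so the crawler stops collecting its data."""
    opted_in.discard(normalize(me_url))


def may_index(page_url):
    """Only pages on an opted-in domain may be fetched and parsed."""
    return normalize(page_url) in opted_in
```

Every fetch would go through `may_index` first, so removing your data is the same single step as adding it.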

Grant Richmond has done this well with the h-card directory. Speaking of which...

Types of Tables

We first discussed what types of tables we need and what data is available to fill them. We did not decide whether each top-level h-* type would get its own table or whether we would use the h-* type as the first column of a single table.

  • h-entry
  • h-review
  • h-feed
  • h-card
  • h-cite

Again we looked at Grant Richmond's UI, but the h-card directory would get parsed as soon as someone joins the search engine.
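The two table layouts we went back and forth on can be sketched with Python's built-in `sqlite3`. This is only an illustration of the choice, not a decided schema: the table names (`h_card`, `h_entry`, `items`) and the columns are assumptions, and only two of the h-* types are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Option A: one table per top-level h-* type (two shown here).
conn.execute("CREATE TABLE h_card (url TEXT PRIMARY KEY, name TEXT, note TEXT)")
conn.execute("CREATE TABLE h_entry (url TEXT PRIMARY KEY, name TEXT, content TEXT)")

# Option B: one shared table with the h-* type as the first column.
conn.execute("CREATE TABLE items (h_type TEXT, url TEXT, name TEXT, content TEXT)")

# Made-up sample rows.
conn.execute(
    "INSERT INTO h_card VALUES (?, ?, ?)",
    ("https://example.com/", "Example Person", "Blogger"),
)
conn.execute(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    ("h-card", "https://example.com/", "Example Person", None),
)
conn.execute(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    ("h-entry", "https://example.com/1", "Hello", "First post"),
)


def count_by_type(connection):
    """With option B, per-type counts are a single GROUP BY."""
    rows = connection.execute(
        "SELECT h_type, COUNT(*) FROM items GROUP BY h_type ORDER BY h_type"
    )
    return dict(rows.fetchall())
```

Option A keeps each table's columns tight to one vocabulary; option B makes cross-type queries simpler but leaves many columns empty.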

Indexing Sites

A feed reader could then be used to index sites. Using the Post Type Discovery algorithm and existing microformats parsers, we can add columns for all the properties a site uses.

For large blogs with decades and gigs of posts, we will index the pages over time in the background. Adding sites quickly gets expensive even quicker.
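As a rough idea of what the indexer would do with each parsed post, here is a simplified sketch in the spirit of the Post Type Discovery algorithm. It assumes the input is one h-entry in the JSON shape microformats2 parsers produce (`{"type": [...], "properties": {...}}`). It is not the full algorithm, and the `checkin` row is my own addition for this search engine, not part of the core property order.

```python
def _first_text(values):
    """Return the first value of an mf2 property as plain text."""
    if not values:
        return ""
    first = values[0]
    if isinstance(first, dict):
        # e-* properties such as content parse to {"html": ..., "value": ...}
        return str(first.get("value", "")).strip()
    return str(first).strip()


def discover_post_type(item):
    """Guess a post type from one parsed h-entry (simplified)."""
    props = item.get("properties", {})
    # Checked in order; the first property present wins.
    ordered = [
        ("rsvp", "rsvp"),
        ("in-reply-to", "reply"),
        ("repost-of", "repost"),
        ("like-of", "like"),
        ("checkin", "checkin"),  # assumption: added for this sketch
        ("video", "video"),
        ("photo", "photo"),
    ]
    for prop, post_type in ordered:
        if props.get(prop):
            return post_type

    name = " ".join(_first_text(props.get("name")).split())
    content = _first_text(props.get("content")) or _first_text(props.get("summary"))
    content = " ".join(content.split())
    # No name, or a name that just repeats the start of the content: a note.
    if not name or content.startswith(name):
        return "note"
    return "article"
```

The returned type could then decide which table, or which set of columns, a post lands in.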

Queries

Some queries, like those involving people, would get hard coded into the search engine. You could ask:

  • Where is @x? The search engine would query the checkin posts for that person and tell you the last known location.
  • Who is @x? This will present the h-card of a person. If there is a p-note or p-summary present, then a tagline will appear in the results.
  • What is @x's Mastodon name? This queries the directory and finds the rel-me link.
  • What (movie, book, podcast) is most popular? This would query the frequency of the "p-name" in the "h-cite" of any watch, read, or listen post (or whatever is the correct answer, much of this is new). These queries could of course be date restricted.
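Two of the hard-coded queries above, sketched over a plain list of already-indexed posts. The flat record shape (`type`, `author`, `published`, `location`, `cite_name`) is an assumption for illustration, not a schema we settled on, and `published` is assumed to be an ISO 8601 string in UTC so that string comparison sorts by time.

```python
from collections import Counter


def last_checkin(posts, author_url):
    """'Where is @x?': location of the author's most recent checkin."""
    checkins = [
        p for p in posts
        if p["type"] == "checkin" and p["author"] == author_url
    ]
    if not checkins:
        return None
    latest = max(checkins, key=lambda p: p["published"])
    return latest["location"]


def most_popular(posts, post_type, since=None):
    """'What is most popular?': the most frequent cited name.

    post_type is "watch", "read", or "listen"; since optionally
    restricts the query to posts published on or after that date.
    """
    titles = Counter(
        p["cite_name"]
        for p in posts
        if p["type"] == post_type
        and (since is None or p["published"] >= since)
    )
    return titles.most_common(1)[0] if titles else None


# Made-up sample data.
posts = [
    {"type": "checkin", "author": "https://example.com/",
     "published": "2018-11-23T10:00:00Z", "location": "New Haven"},
    {"type": "checkin", "author": "https://example.com/",
     "published": "2018-11-25T09:00:00Z", "location": "Boston"},
    {"type": "read", "author": "https://example.com/",
     "published": "2018-11-20T08:00:00Z", "cite_name": "Book A"},
    {"type": "read", "author": "https://example.org/",
     "published": "2018-11-22T08:00:00Z", "cite_name": "Book A"},
    {"type": "read", "author": "https://example.org/",
     "published": "2018-11-26T08:00:00Z", "cite_name": "Book B"},
]
```

With this sample data, `last_checkin(posts, "https://example.com/")` returns `"Boston"` and `most_popular(posts, "read")` returns `("Book A", 2)`.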

Keyword Search

The keyword search would look for exact matches in:

  • first p-name after the h-*
  • p-category or rel="tag"
  • content
  • h-cite

These could then be weighted in some form of ranking:

  • +100 if keyword in the p-name and also p-category
  • +50 if p-name
  • +25 if p-category
  • +10 for each exact match in the content
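The weights above could be sketched like this. Two things here are my assumptions, since we have not decided them: the name and category tiers are exclusive (a post gets 100, 50, or 25, not the sum), and "exact match" means a case-insensitive whole-word match. The `score` function and the flat post shape are hypothetical.

```python
import re


def score(post, keyword):
    """Rank one post for a keyword using the weights listed above."""
    # Whole-word, case-insensitive match (assumption for "exact match").
    pattern = re.compile(
        r"(?<!\w)" + re.escape(keyword) + r"(?!\w)", re.IGNORECASE
    )
    in_name = bool(pattern.search(post.get("name", "")))
    categories = {c.lower() for c in post.get("category", [])}
    in_category = keyword.lower() in categories

    # Exclusive tiers (assumption): both beats name, name beats category.
    if in_name and in_category:
        points = 100
    elif in_name:
        points = 50
    elif in_category:
        points = 25
    else:
        points = 0

    # +10 for each exact match in the content.
    points += 10 * len(pattern.findall(post.get("content", "")))
    return points


# Made-up sample post.
post = {
    "name": "Scoping Out Basics of IndieWeb Search",
    "category": ["IndieWeb", "search"],
    "content": "A search engine for the IndieWeb. Search should be opt-in.",
}
```

For the sample post, `score(post, "search")` is 120: 100 for the name plus category hit and 10 for each of the two matches in the content.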

Next Steps

We needed to scope out an MVP, which this blog post now completes. Next we will start testing the different microformats-to-JSON parsers to populate tables with dynamic columns and see which can become static columns.

We will start with my blog but need a few other volunteers. Find me in chat if interested.

Update: Ryan Barrett reminded me of https://indiemap.org, which already has data to muck around in and a prior example of some crawling technology.

We also need help from people with experience using the IndieWeb building blocks.

Big Questions?

Can we add a micropub client so if you are signed into the search engine you can reply and interact with the results?

Can we develop APIs so people could add the search engine natively to their blogs for both local and network searches?

Could a private search engine help protect vulnerable blogging communities by controlling not only who can use the search engine but also giving users full control over what data is parsed?

Overall I think an opt-in search engine, where you can add and remove your data as easily as every other time you use IndieLogin, will be great for the community. Search technologies combined with the building blocks already created could also make such a search tool useful for other consumable feeds.

Greg McVerry

@xolotl Also the across the

Greg McVerry

@Cambridgeport90 I am in Eastern CT. Boston a quick hop and a jump. Stop by our WordPress channel to learn about WordPress Fediverse anytime...

Though I am most excited about micro.blog. WordPress > micro.blog > Mastodon might be amazing.

Greg McVerry

Excited to see rel=me go live on Mastodon: https://www.zylstra.org/blog/2018/10/mastodon-rel-me/ time to control your identity. I was never famous or cool enough to get a blue check on @Twitter, glad to see the fediverse have a more equitable approach to identity and verification