My quick thoughts, back stage, and rants as I try to Teach kids about the Web while learning how to help others build a better Web.

Greg McVerry

@ruxton perfect. Left you a !tell come on board as Organizer. Let's make sure Australian community can go to @IndieWebCamp online

Greg McVerry

dani@00dani.me @ruxton @mrkndvs @vanessa @wentale https://www.serenawho.com/ @eddiehinkle organizing an online @indiewebcamp if Australian interest could run live sessions (keynotes, intros, demos) in your timezone.. We just need an Aussie organizer. Volunteers?

Greg McVerry

@MarcelleHaddix We can brainstorm after you get a few days off. I am helping to organize an @indiewebcamp online and will have a better feel, but the on-the-ground events are always 100% free and cost about >500 (all donations) to run.

Greg McVerry

Especially members interested in radical change. The org models used by @IndieWebCamp provide a good playbook.

Type of change people want can not happen in formal orgs like LRA or universities. Not a knock, just a reality. It will take collectives

Greg McVerry

@peggysemmingson @marcellehaddix Peggy we are also organizing @indiewebcamp Austin Feb 23-24th. You should drive down. Other members interested in blogging, social media, or want to build their website should go as well.

Greg McVerry

@traceyhabla Please share this with your UTAustin family at https://indieweb.org/2019/Austin We will be at @IndieWebCamp Feb 23-24. The future of youth and family requires digital indigenous communities. An English-only web is not an open web.

Greg McVerry

@DVISD_SS going to be down in Austin for Feb-23-24th https://indieweb.org/2019/Austin if you wanted to get together and chat students and reflective blogging. Actually @microdotblog one of the coolest platforms for reflective writing is just down the street from you

Greg McVerry

Scoping Out Basics of #IndieWeb Search

4 min read

Over the weekend I met with the CEO of BLUR Search Technologies. Jaime is also my brother-in-law and sponsored IndieWebCamp NYC in 2018. We mainly gathered for Thanksgiving, the second Thanksgiving, and finally leftovers.

As we all played clean the fridge, we snuck away to scope out a possible search engine for the IndieWeb community. BLUR Search Technologies will donate time and technology, but we will need some help implementing some building blocks: IndieAuth, the Post Type Discovery algorithm, etc.

We will also check out indiemap.org and see how much of it we can use. I think it will be a ton, plus we already have data to play with.

Opt-in with IndieAuth

Yes, many of us publish openly, even with liberal licenses that allow for remixing and forking, but this does not mean we want the data scraped, parsed, and sorted. The right thing and what you have the right to do are not always the same.

Thus the first feature we would need to have would be an opt-in service using the IndieAuth protocol. Meaning the only website data the search engine would collect would be that which you authorized.
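A minimal sketch of that opt-in gate, in Python. The IndieAuth sign-in itself is not shown; the class name `OptInRegistry` and its methods are hypothetical, and the sketch assumes opt-in is tracked per domain.

```python
from urllib.parse import urlparse


class OptInRegistry:
    """Hypothetical helper: remembers which sites authorized indexing."""

    def __init__(self):
        self._domains = set()

    @staticmethod
    def _domain(url):
        # Accept bare domains ("example.com") as well as full URLs.
        parsed = urlparse(url if "://" in url else "https://" + url)
        return parsed.netloc.lower()

    def opt_in(self, me_url):
        # Assumption: called only after an IndieAuth sign-in verified me_url.
        self._domains.add(self._domain(me_url))

    def opt_out(self, me_url):
        # Removing a site is as simple as adding it.
        self._domains.discard(self._domain(me_url))

    def may_index(self, page_url):
        # The crawler asks this before fetching or storing any page.
        return self._domain(page_url) in self._domains
```

The crawler would call `may_index` before touching a page, so a site that never signed in, or that opted out later, is simply skipped.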

Grant Richmond has done this well with the h-card directory. Speaking of which...

Types of Tables

We first discussed what types of tables we need and what data is available to fill them. We did not decide whether each top-level h-* type would get its own table or whether we would use the h-* type as the first column.

  • h-entry
  • h-review
  • h-feed
  • h-card
  • h-cite

Again we looked at Grant Richmond's UI, but the h-card directory would get parsed as soon as someone joins the search engine.
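To make the second option concrete, here is a rough sketch of a single table with the h-* type as the first column, using Python's built-in sqlite3. The table name, column names, and the choice to store properties as a JSON string are all assumptions for illustration, not a decision we made.

```python
import json
import sqlite3

# In-memory database, just for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE items (
        h_type     TEXT NOT NULL,  -- e.g. 'h-entry', 'h-card'
        url        TEXT NOT NULL,  -- where the item was found
        properties TEXT NOT NULL   -- parsed properties, stored as JSON
    )
    """
)


def add_item(h_type, url, properties):
    # Store one parsed microformats item.
    conn.execute(
        "INSERT INTO items VALUES (?, ?, ?)",
        (h_type, url, json.dumps(properties)),
    )


def items_of_type(h_type):
    # Return (url, properties) pairs for one h-* type.
    rows = conn.execute(
        "SELECT url, properties FROM items WHERE h_type = ? ORDER BY url",
        (h_type,),
    )
    return [(url, json.loads(props)) for url, props in rows]
```

One table per h-* type would instead give each type its own fixed columns, at the cost of a schema change whenever a new property shows up.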

Indexing Sites

A feed reader could then be used to index sites. Using the post type discovery algorithm and existing microformats parsers, we can add columns for all the properties used in the microformats listed above.

For large blogs with decades and gigabytes of posts, we will index the pages over time in the background. Adding sites quickly gets expensive even quicker.
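As a rough idea of what the post type step could look like, here is a simplified Python sketch modeled on the Post Type Discovery algorithm. It assumes the input is the `properties` dictionary of one h-entry as produced by a microformats2 parser (each value is a list), and it leaves out parts of the real algorithm such as validating the rsvp value and event posts.

```python
def _first_text(properties, key):
    # Parsed values are lists; content is often {"html": ..., "value": ...}.
    values = properties.get(key) or []
    if not values:
        return ""
    first = values[0]
    if isinstance(first, dict):
        return str(first.get("value", ""))
    return str(first)


def discover_post_type(properties):
    # Response and media types are checked first, in this order.
    checks = (
        ("rsvp", "rsvp"),
        ("in-reply-to", "reply"),
        ("repost-of", "repost"),
        ("like-of", "like"),
        ("video", "video"),
        ("photo", "photo"),
    )
    for prop, post_type in checks:
        if properties.get(prop):
            return post_type

    # Collapse whitespace before comparing name and content.
    name = " ".join(_first_text(properties, "name").split())
    content = _first_text(properties, "content") or _first_text(properties, "summary")
    content = " ".join(content.split())

    # No name, or a name that just repeats the start of the content: a note.
    if not name or content.startswith(name):
        return "note"
    return "article"
```

The returned type could then decide which columns get filled for a given post.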

Queries

Some queries, like those involving people, would get hard coded into the search engine. You could ask:

  • Where is @x? The search engine would query the checkin posts for that person and tell you the last known location.
  • Who is @x? Will present the h-card of a person. If there is a p-note or p-summary present, then a tagline will appear in the results.
  • What is @x's Mastodon name? Queries the directory and finds the rel-me link.
  • What (movie, book, podcast) is most popular? This would query the frequency of "p-name" in the "h-cite" of any watch, read or listen post (or whatever is the correct answer, much of this is new). These queries could of course be date restricted.
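The "most popular" query above could be sketched like this in Python. It assumes entries are already parsed into microformats2-style dictionaries, and that the cited work sits in a property such as `read-of` holding an h-cite. Both the property names and the function name `most_popular` are assumptions, since much of this markup is still new.

```python
from collections import Counter


def most_popular(entries, prop, top=3):
    # Count how often each cited name appears under the given property.
    counts = Counter()
    for entry in entries:
        for cite in entry.get("properties", {}).get(prop, []):
            # Skip plain URL values; only nested h-cite items carry a name.
            if not isinstance(cite, dict):
                continue
            if "h-cite" not in cite.get("type", []):
                continue
            for name in cite.get("properties", {}).get("name", []):
                # Normalize case and spacing so "Dune" and "dune " match.
                counts[name.strip().lower()] += 1
    return counts.most_common(top)
```

Date restriction would be a filter on the entries before they are passed in.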

Keyword Search

The keyword search would look for exact matches in:

  • first p-name after the h-*
  • p-category or rel="tag"
  • content
  • h-cite

These could then be weighted in some form of ranking:

  • +100 if keyword in the p-name and also p-category
  • +50 if p-name
  • +25 if p-category
  • +10 for each exact match in the content
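Here is a small Python sketch of that weighting. Two things are my assumptions: the +100 replaces the +50 and +25 rather than stacking with them, and an "exact match" means a case-insensitive whole-word match. The function name `score` is only for illustration.

```python
import re


def score(keyword, name, categories, content):
    # Whole-word, case-insensitive match; lookarounds also handle "#tags".
    pattern = re.compile(
        r"(?<!\w)" + re.escape(keyword) + r"(?!\w)", re.IGNORECASE
    )
    in_name = bool(pattern.search(name))
    in_category = any(pattern.search(c) for c in categories)

    if in_name and in_category:
        total = 100
    elif in_name:
        total = 50
    elif in_category:
        total = 25
    else:
        total = 0

    # +10 for every exact match in the content.
    total += 10 * len(pattern.findall(content))
    return total
```

For example, a post named "IndieWeb Search", tagged "search", that mentions the word twice in its content would score 120 for the keyword "search".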

Next Steps

We needed to scope out an MVP, which this blog post now completes. Next we will start testing the different microformats-to-JSON parsers to populate tables with dynamic columns and see which can be static columns.

We will start with my blog but need a few other volunteers. Find me in chat if interested.

Update: Ryan Barrett reminded me of https://indiemap.org, which already has data to muck around in and a prior example of some crawling technology.

We also need help from people with experience using the IndieWeb building blocks.

Big Questions?

Can we add a micropub client so if you are signed into the search engine you can reply and interact with the results?

Can we develop APIs so people could add the search engine natively to their blogs for both local and network searches?

Could a private search engine help protect vulnerable blogging communities by controlling not only who can use the search engine but also giving users full control over what data is parsed?

Overall I think an opt-in search engine, where you can add and subtract your data as easily as every other time you use IndieLogin, will be great for the community. Combined with the existing building blocks already created, such a search tool would be useful for other consumable feeds as well.

Greg McVerry

Just proposed a "nap time" session idea for the upcoming IndieWebCamp-Online. Meaning parents can give a window for their session based on normal nap schedules with understanding start and stop times may shift dramatically.

Greg McVerry

@mrkdndvs we are organizing an online @indiewebcamp and we currently don't have anyone to facilitate utc+9 to 11. Not sure if interest is there in Aussie/NZ/Pacifics but would you want to get involved?