Archive: May 8th, 2008

Why You Should Have a Web Site

The enigmatic is at XTech to tell us Why you should have a Web site: it’s the law! (and other Web 3.0 issues). God, I hope he’s using Web 3.0 ironically.

Steven has heard many predictions in his time: that we will never have LCD screens, that digital photography could never replace film, etc. But the one he wants to talk about is Moore’s Law. People have been seeing that it hasn’t got long to go since 1977. Steven is going to assume that Moore’s Law is not going to go away in his lifetime.

In the 1980s the most powerful computers were the Crays. People used to say One day we will all have a Cray on our desk. In fact most laptops are about 120 Craysworth and mobile phones are about 35 Craysworth.

There is actually an LED correlation to Moore’s Law (brighter and cheaper faster). Steven predicts that within our lifetime all lighting will be LCDs.

Bandwidth follows a similar trend. Jakob Nielsen likes to claim this law; that bandwidth will double every year. In fact the timescale is closer to 10.5 months.

Following on from Moore’s and Nielsen’s laws, there’s Metcalfe’s Law: the value of a network is proportional to the square of the number of nodes. This is why it’s really good that there is only one email network and bad that there are so many instant messenger networks.

Let’s define the term Web 2.0 using Tim O’Reilly’s definition: sites that gain value by their users adding data to them. Note that these kinds of sites existed before the term was coined. There are some dangers to Web 2.0. When you contribute data to a web site, you are locking yourself in. You are making a commitment just like when you commit to a data format. This was actually one of the justifications for XML — data portability. But there are no standard ways of getting your data out of one Web 2.0 site and into another. What if you want to move your photos from one website to another? How do you choose which social networking sites to commit to? What about when a Web 2.0 site dies? This happened with MP3.com and Stage6. Or what about if your account gets closed down? There are documented cases of people whose Google accounts were hacked so those accounts were subsequently shut down — they lost all their data.

These are examples of Metcalfe’s law in action. What should really happen is that you keep all your data on your website and then aggregators can distribute it across the Web. Most people won’t want to write all the angle brackets but software should enable you to do this.

What do we need to realize this vision? First and foremost, we need machine-readable pages so that aggregators can identify and extract data. They can then create the added value by joining up all the data that is spread across the whole Web. Steven now pimps RDFa. It’s like microformats but it will invalidate your markup.

Once you have machine-readable semantics, a browser can do a lot more with the data. If a browser can identify something as an event, it can offer to add it to your calendar, show it on a map, look up flights and so on. (At this point, I really have to wonder… why do the RDFa examples always involve contact details or events? These are the very things that are more easily solved with microformats. If the whole point of RDFa is that it’s more extensible than microformats, then show some examples of that instead of showing examples that apply equally well to hCalendar or hCard)

So rather than putting all your data on other people’s Web sites, put all your data on your Web site and then you get the full Metcalfe value. But where can you store all this stuff? Steven is rather charmed by routers that double up as web servers, complete with FTP. For a personal site, you don’t need that much power and bandwidth. In any case, just look at all the power and bandwidth we do have.

To summarise, Web 2.0 is damaging to the Web. It divides the Web into topical sub-webs. With machine-readable pages, we don’t need those separate sites. We can reclaim our data and still get the value. Web 3.0 sites will aggregate your data (Oh God, he is using the term unironically).

Questions? Hell, yeah!

Kellan kicks off. Flickr is one of the world’s largest providers of RDFa. He also maintains his own site. Even he had to deal with open source software that got abandoned; he had to hack to ensure that his data survived. How do we stop that happening? Steven says we need agreed data formats like RDFa. So, Kellan says, first we have to decide on formats, then we have to build the software and then we have to build the aggregators? Yes, says Steven.

Dan says that Web 2.0 sites like Flickr add the social value that you just don’t get from building a site yourself. Steven points to MP3.com as a counter-example. Okay, says Dan, there are bad sites. Simon interjects, didn’t Flickr build their API to provide reassurance to people that they could get their data out? Not quite, says Kellan, it was created so that they could build the site in the first place.

Someone says they are having trouble envisioning Steven’s vision. Steven says I’m not saying there won’t be a Flickr — they’ll just be based on aggregation.

Someone else says that far from being worried about losing their data on Flickr, they use Flickr for backup. They can suck down their data at regular intervals (having written a script on hearing of the Microsoft bid on Yahoo). But what Flickr owns is the URI space.

Gavin Starks asks about the metrics of energy usage increases. No, it drops, says Steven.

Ian says that Steven hit on a bug in social websites: people never read the terms of service. If we encouraged best practices in EULAs we could avoid worst-case scenarios.

Someone else says that our focusing on Flickr is missing the point of Steven’s presentation.

Someone else agrees. The issue here is where the normative copy of your data exists. So instead of the normative copy living on Flickr, it lives on your own server. Flickr can still have a copy though. Steven nods his head. He says that the point is that it should be easy to move data around.

Time’s up. That was certainly a provocative and contentious talk for this crowd.

Data Portability For Whom?

It’s time for my second Gavin of the day at XTech. Gavin Bell asks Data portability for whom?

To start with, we’ve got a bunch of great technologies like OpenID and OAuth that we’re using to build an infrastructure of openness and portability but right now, these technologies don’t interoperate very cleanly. Getting a show of hands, everyone here knows of OpenID and OAuth and almost everyone here has an OpenID and uses it every week.

But we’re the alpha geeks. We forget how ahead of the curve we are. Think of RSS. We imagine it’s a widely-accepted technology but most people don’t know what it is. That doesn’t matter though as long as they are using RSS readers and subscribing to content; people don’t need to know what the underlying technology is.

Clay Shirky talked about cognitive surplus recently. We should try to tap into that cognitive surplus as Wikipedia has done. Time for some psychology.

Cognitive psychology as a field is about the same age as the study of artificial intelligence. A core tool is something called a schema, a model of understanding of the world. For example, we have a schema for a restaurant. They tend to have tables, chairs, cutlery, waiters, menus. But there is room for variation. Chinese restaurants have chopsticks instead of knives and forks, for example. We have a schema for the Web involving documents that reside at URLs. Schema congruence is the degree to which our model of the world matches the ideal model of the world.

Schemas change and adapt. Our idea of what a mobile phone is, or is capable of, has changed in the last few years. Schemas teach us that gradual change is better than big bang changes. We need a certain level of stability. When you’re pushing the envelope and changing the mental model of how something can work, you still need to support the old mental model. A good example of mental model extension is the graceful way that Flickr added video support. However, because the change was quite sudden, a portion of people got very upset. Gradual change is less scary.

Cognitive dissonance, a phrase that is often misused, is the unfortunate tension that can result from holding two conflicting thoughts at the same time. On the web, the cognitive dissonance of seeing content outside its originating point is dissipating.

J.J. Gibson came up with the idea of affordances. Chairs afford sitting on. Cups afford liquid to be poured into them. When we’re using affordances, it’s important to stick to common convention. If, on a website, you use a plus sign to allow someone to add something to a cart, you shouldn’t use the same symbol later on to allow an image to be enlarged.

Flow is the immensely enjoyable state of being fully immersed in what you’re doing. This is like the WWILFing experience on Wikipedia. You get it on Flickr too. Now we’re getting flow with multiple sites as we move between del.icio.us and Dopplr and Twitter, etc. Previously we would have experienced cognitive dissonance. Now we’re pivoting.

B.F. Skinner did a lot of research into reinforcement. We are sometimes like rats and pigeons on the Web as we click the buttons in an expectation of change (refreshing RSS, email, etc.).

Experience vs. features …don’t be feature led. A single website is just one part of people’s interaction with one another. Here’s the obligatory iPod reference: they split the features up so that the bare minimum were on the device and the rest were put into the iTunes software.

We’ve all lost count of the number of social networks we’ve signed up to. That’s not true of — excuse me, Brian — regular people. Regular people won’t upgrade their browser for your website. Regular people won’t install a plug-in for their browser. We shouldn’t be trying to sell technologies like OpenID, we should be making the technology invisible.

Gavin uses Leslie’s design of the Satisfaction sign-up process as an example. She never mentions hCard. Nobody needs to know that.

We’re trying lots of different patterns and we often get it wrong. The evil password antipattern signup page on the Spokeo website is the classic example of getting it wrong.

We must remember the hinternet. Here’s a trite but true example: Gavin’s mum …she doesn’t have her own email address. She shares it with Gavin’s dad. According to most social network sites, they are one person. And be careful of exposing stuff publicly that people don’t expect. Also, are we being elitist with things like OpenID delegation that is only for people who have their own web page and can edit it?

Our data might be portable but what about the context? If I can move a picture from Flickr but I can’t move the associated comments then what’s the point?

We’re getting very domain-centric. It would be great if everyone was issued with their own domain name. Most people don’t even think about buying a domain name. They might have a MySpace page or Facebook profile but that’s different.

Some things are getting better. People have stopped mentioning the http:// prefix. But many people don’t even see or care about your lovely URL structure. Anyway, with portable data, when you move something (like a blog post), you lose the lovely URL path.

Larry Tesler came up with the law of the conversation of complexity. There is a certain basic level of complexity. We are starting to build this basic foundation with OpenID and OAuth — they could be like copy and paste on the desktop.

We built a Web for us, geeks, but we built it in a social way. We are discoverable. We live online. This lends itself well to smaller, narrower, tailored services like Dopplr for travel, Fire Eagle for location, AMEE for carbon emissions. But everything should integrate even better. Why can’t clicking “done” in Basecamp generate an invoice in Blinksale, for example? If they were desktop applications, we’d script something. Simon interjects that if they were open source, we would modify them. That’s what Gavin is agitating for. The boundaries are blurring. We have lots of applications both on and off the Web but they are all connected by the internet. People don’t care that much these days about what application they are currently using or who built it; it’s the experience that’s important.

Here’s something Gavin wants somebody to make: identity brokerage. This builds on his id6 idea from last year. That was about contact portability. Now he wants something to deal with all the invitations he gets from social networks. Now that we’ve got OpenID, why can’t we automate the acceptance or rejection of friend requests?

We are heading towards a distributed future. DiSo points the way. But let’s learn from RSS and make the technology invisible. We need to make sense of the Web for the people coming after us. That may sound elitist but Gavin doesn’t mean it to be.

Kellan asks if we can just change the schema. Gavin says we can but we should change it gradually.

Step-by-step reassurance is important. Get the details right. Magnolia is starting to get this right with its sign-in form which lists the services you can sign in through, rather than the technology (OpenID).

We are sharing content, not making friends. Dopplr gets this right by never using the word friend. Instead it lists people with whom you share your trips. The Pownce approach of creating sub-groups from a master list is close to how people really work.

Scaffolding and gradual change are important. As a child, we are told two apples plus three apples is five apples. Later we learn that two plus three equals five; the scaffolding is removed. We must first build the scaffolding but we can remove it later.

Gavin wraps up and even though the time is up, the discussion kicks off. Points and counterpoints are flying thick and fast. The main thrust of the discussion is whether we need to teach the people of the hinternet about they way things work or to hide all that stuff from them. There’s a feast of food for thought here.

Ni Hao, Monde: Connecting Communities Across Cultural and Linguistic Boundaries

Simon Batistoni is responsible for Flickr’s internationalisation and he’s going to share his knowledge here at XTech. Flickr is in a lucky position; its core content is pictures. Pictures of cute kittens are relatively universal.

We, especially the people at this conference, are becoming hyperconnected with lots of different ways of communicating. But we tend to forget that there is this brick wall that many of us never run into; we are divided.

In the beginning was the Babelfish. When some people think of translation, this is what they think of. We’ve all played the round-trip translation game, right? Oh my, that’s a tasty salad becomes that’s my OH — this one is insalata of tasty pleasure. It’s funny but you can actually trace the moment where tasty becomes of tasty pleasure (it’s de beun gusto in Spanish). Language is subtle.

It cannot really be encoded into rules. It evolves over time. Even 20 years ago if you came into the office and said I had a good weekend surfing it may have meant something different. Human beings can parse and disambiguate very well but machines can’t.

Apocraphyl story alert. In 1945, the terms for Japanese surrender were drawn up using a word which was intended to convey no comment. But the Japanese news agency interpreted this as we ignore and reported it as such. When this was picked up by the Allies, they interpreted this as a rejection of the terms of surrender and so an atomic bomb was dropped on Hiroshima.

Simon plugs The Language Instinct, that excellent Steven Pinker book. Pinker nails the idea of ungrammatically but it’s essentially a gut instinct. This is why reading machine translations is uncomfortable. Luckily we have access to language processors that are far better than machines …human brains.

Here’s an example from Flickr’s groups feature. The goal was to provide a simple interface for group members to translate their own content: titles and descriptions. A group about abandoned trains and railways was originally Spanish but a week after internationalisation, the group exploded in size.

Here’s another example: 43 Things. The units of content are nice and succinct; visit Paris, fall in love, etc. So when you provide an interface for people to translate these granular bits, the whole thing snowballs.

Dopplr is another example. They have a “tips” feature. That unit of content is nice and small and so it’s relatively easy to internationalise. Because Dopplr is location-based, you could bubble up local knowledge.

So look out for some discrete chunks of content that you can allow the community to translate. But there’s no magic recipe because each site is different.

Google Translate is the great white hope of translation — a mixture of machine analysis on human translations. The interface allows you to see the original text and offers you the opportunity to correct translations. So it’s self-correcting by encouraging human intervention. If it actually works, it will be great.

Wait, they don’t love you like I love you… Maaa-aa-a-aa-aa-a-aa-aaps.

Maps are awesome, says Simon. Flickr places, created by Kellan who is sitting in front of me, is a great example of exposing the size and variation of the world. It’s kind of like the Dopplr Raumzeitgeist map. Both give you an exciting sense of the larger, international community that you are a part of. They open our minds. Twittervision is much the same; just look at this amazing multicultural world we live in.

Maps are one form of international communication. Gestures are similar. We can order beers in a foreign country by pointing. Careful about what assumptions you make about gestures though. The thumbs up gesture means something different in Corsica. There are perhaps six universal facial expressions. The game Phantasy Star Online allowed users to communicate using a limited range of facial expressions. You could also construct very basic sentences by using drop downs of verbs and nouns.

Simon says he just wants to provide a toolbox of things that we can think about.

Road signs are quite universal. The roots of this communication stretches back years. In a way, they have rudimentary verbs: yellow triangles (“be careful of”), red circles (“don’t”).

Star ratings have become quite ubiquitous. Music is universal so why does Apple segment the star rating portion of reviews between different nationality stores? People they come together, people they fall apart, no one can stop us now ‘cause we are all made of stars.

To summarise:

  • We don’t have phasers and transporters and we certainly don’t have universal translators. It’s AI hard.
  • Think about the little bits of textual content that you can break down and translate.

Grab the slides of this talk at hitherto.net/talks.

It’s question time and I ask whether there’s a danger in internationalisation of thinking about language in a binary way. Most people don’t have a single language, they have a hierarchy of languages that they speak to a greater or lesser degree of fluency. Why not allow people to set a preference of language hierarchy? Simon says that Flickr don’t allow that kind of preference setting but they do something simpler; so if you are on a group page and it isn’t available in your language of choice, it will default to the language of that group. Also, Kellan points out, there’s a link at the bottom of each page to take you to different language versions. Crucially, that link will take you to a different version of the current page you’re on, not take you back to the front of the site. Some sites get this wrong and it really pisses Jessica off.

Someone asks about the percentage of users who are from a non-English speaking country but who speak English. I jump in to warn of thinking about speaking English in such a binary way — there are different levels of fluency. Simon also warns about taking a culturally imperialist attitude to developing applications.

There are more questions but I’m too busy getting involved with the discussion to write everything down here. Great talk; great discussion.

AMEE — The World’s Energy Meter

Gavin Starks, the man behind AMEE — the Avoiding Mass Exctinction Engine — is back at XTech this year. The service was launched at XTech in Paris last year.

Data providers have been added in the last year, including the Irish government. There’s also a bunch of new sources that are data mined. There are plenty of consumers too, including Google and change.ie from the Irish government. It’s cool to have countries on board. Here’s Edenbee. Yay! Gavin really likes it. The Carbon Account is another great one. But Gavin’s favourite is probably the Dopplr integration.

AMEE is tracking 850,000 carbon footprints now. That’s all happened in 12 months. There are over 500 organisations and individuals using AMEE. That’s over 500 calls to Gavin’s mobile number which he made available on the website.

Gavin describes AMEE as a neutral aggregation platfrom. The data is provided from agencies that can license or syndicate their data. This data is then used by developers who can build products and services on top of it. So AMEE is, by design, commercially enabling to 3rd parties.

Gavin says they are trying to catalyse change. They want to create a standard for measuring carbon emissions. To a large extent, they’ve achieved that. Even though there are lots of different data providers, AMEE provides a single point of measurement. The vision is to measure the CO2 emissions of everything. That’s a non-trivial task so they’ve concentrated solely on doing that one thing.

AMEE has profiles for your carbon identity and your energy identity but both are deliberately kept separate. The algorithms for energy measurement might change (for example, how carbon emissions from flights are measured) but your carbon identity should remain constant. This separation allows for real data portability e.g. integrating your Dopplr account with your Edenbee account. AMEE takes care of tracking energy but they don’t care about who you are: everything is anonymous and abstracted. It’s up to you as a developer of social apps to take care of establishing identity. There’s a lot of potential here, kind of like Fire Eagle; a service that concentrates on doing one single thing really well.

They’re partnering on tracking technology. For example, tracking Blackberries and using the speed of travel to guess what mode of transport you are using at any one time.

AMEE has a RESTful API that returns XML and JSON. They also provide more complicated, Enterprise-y stuff to please the Java people.

There are different pricing models. Media companies pay more than other companies. Charities pay nothing.

What’s next? AMEE version 2; making it easier for people to engage with the service. In the long term, let’s go after all the products that exist. Someone has that data in a spreadsheet somewhere — let us get at it.

Why do all this? Why do you think? Does anybody really need to be convinced about climate change at this stage? There will always be debate in science but even senior conservative scientists are coming out and saying that they may have underestimated the impact of carbon emissions. If a level of 450ppm continues long enough (and that’s the level we’re aiming for), that’s a sea rise of up to 75 metres. That’s an exctinction level event. We might well be fucked but as Stephen Fry says:

Doing nothing risk everything and gains comparitively little, doing something risks comparitively little and gains the whole world.

Here’s where AMEE comes in: if we can measure and visualise energy consumption change, that will drive social change. In the long term we will have to completely re-engineer our lifestyles and re-invent the power grid. Shut down power stations, shut down oil platforms, reduce all travel …measure and visualise all of it.

We don’t just need change; we need a systematic redesign of the future. We could start with the political language we use. Instead of using the word “consumer” with its positive connotations, let’s say “waster” which is more accurate.

What will you build? www.amee.cc

The Pownce Blog » Blog Archive » Public file sharing and increased file sizes!

Here are the fruits of the latest code push at Pownce: the ability to share files with the public and a tenfold increase in the file size limit.