embedding meta data for copy/paste usages - possible use case for RDF-in-HTML?

Discussion:

Hallvord R M Steen

2009-01-19 16:52:19 UTC

If this was discussed already, sorry. There has been so much RDF/meta
data discussion that I'm far from on top of it..

I'd like some way to add meta data to a page that could be integrated
with the UA's copy/paste commands.

For example, if I copy a sentence from Wikipedia and paste it in some
word processor, it would be great if the word processor offered to
automatically create a bibliographic entry.

If I copy the name of one of my Facebook "friends" and paste it into
my OS address book, it would be cool if the contact information was
imported automatically. Or maybe I pasted it in my webmail's address
book feature, and the same import operation happened..

If I select an E-mail in my webmail and copy it, it would be awesome
if my desktop mail client would just import the full E-mail with
complete headers and different parts if I just switch to the mail
client app and paste.

To make such use cases possible I suppose what we need is
a) some way to embed standardised interchangeable meta data in HTML
(so that users can copy from regular web pages)
b) some support in the UA for figuring out what meta data applies to a
selection and, say, place three alternative formats on the clipboard:
1) text/plain
2) text/html
3) application/metasomething+xml
c) support in other applications for detecting the third format on the
clipboard, parsing and using it. For example, a web application might
use the HTML5 clipboard data API to detect the meta data, parse it
with the UA's XML parser, and figure out if it was data it could make
use of.
Most applications would use *both* the regular text (plain or HTML)
format and the meta data.
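
As a rough illustration of points (b) and (c), here is a minimal Python sketch of a copy operation that offers three alternative formats, and of a recipient that uses both the text and the meta data. The MIME type `application/metasomething+xml` is the placeholder from the proposal; the XML element names and the function names `build_clipboard_payload` and `paste_with_citation` are assumptions made up for this example, not part of any existing format or API.

```python
from html import escape
from xml.etree import ElementTree as ET

META_TYPE = "application/metasomething+xml"  # placeholder type from the proposal


def build_clipboard_payload(selected_text, page_url, page_title):
    """Return a dict mapping MIME type -> serialized clipboard data."""
    # Hypothetical meta data vocabulary: <metadata><source/><title/></metadata>
    meta = ET.Element("metadata")
    ET.SubElement(meta, "source").text = page_url
    ET.SubElement(meta, "title").text = page_title
    html = '<blockquote cite="{}">{}</blockquote>'.format(
        escape(page_url, quote=True), escape(selected_text)
    )
    return {
        "text/plain": selected_text,
        "text/html": html,
        META_TYPE: ET.tostring(meta, encoding="unicode"),
    }


def paste_with_citation(payload):
    """Recipient side: use the plain text, plus the meta data when present."""
    text = payload["text/plain"]
    raw = payload.get(META_TYPE)
    if raw is None:
        return text  # ordinary paste, nothing extra on the clipboard
    meta = ET.fromstring(raw)
    return "{} [{}, {}]".format(text, meta.findtext("title"), meta.findtext("source"))
```

A recipient that does not know the third format simply ignores it and pastes the text as it does today.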

Would anyone use this?

I think that actually some of the functionality we would enable here
would be so compelling that users would request it. If, for example,
Wikipedia -> OpenOffice pasting created automatic bibliography entries
users would start asking why Encyclopedia Britannica -> Microsoft Word
did not. If Myspace.com let you copy a selected contact and paste in
some webmail or OS address book, Facebook users would start several
Facebook groups trying to get it "working" there.

--
Hallvord R. M. Steen

Lachlan Hunt

2009-01-19 19:21:25 UTC

Post by Hallvord R M Steen
I'd like some way to add meta data to a page that could be integrated
with the UA's copy/paste commands.

These use cases are a good start, but the problem is that you've begun
with the assumption that copy and paste would be a part of the solution.

Post by Hallvord R M Steen
For example, if I copy a sentence from Wikipedia and paste it in some
word processor, it would be great if the word processor offered to
automatically create a bibliographic entry.

Do you mean a bibliographic entry that references the source web site,
and included information such as the URL, title, publication date and
author names? That could be a useful feature, even if it could only
obtain the URL and title easily.

Often, when writing an article that quotes several websites, it's a time
consuming process to copy and paste the quote, then the page or article
title and then the URL to link to it. An editor with a Paste as
Quotation feature which helped automate that would be useful.

HTML5 already contains elements that can be used to help obtain this
information, such as <title>, <article> and its associated headings
<h1> to <h6>, and <time>. Obtaining author names might be a little more
difficult, though perhaps hCard might help.

Post by Hallvord R M Steen
If I copy the name of one of my Facebook "friends" and paste it into
my OS address book, it would be cool if the contact information was
imported automatically. Or maybe I pasted it in my webmail's address
book feature, and the same import operation happened..

I believe this problem is adequately addressed by the hCard microformat
and various browser extensions that are available for some browsers,
like Firefox. The solution doesn't need to involve a copy and paste
operation. It just needs a way to select contact info on the page and
export it to an address book. There are even web services that will
parse an HTML page and output a vCard file that can be imported directly
into address book programs.
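
To make the hCard-to-vCard route concrete, here is a minimal Python sketch that reads a few hCard class names (`fn`, `email`, `tel`) from an HTML fragment and emits a vCard. It is only an illustration under simplifying assumptions: it handles a small subset of hCard, assumes markup without void elements such as `<br>`, skips vCard value escaping, and the names `HCardParser` and `to_vcard` are invented for this example.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Collect the text of elements whose class list names an hCard property."""

    WANTED = ("fn", "email", "tel")  # small subset of hCard properties

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open element: property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in self.WANTED), None)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text belongs to the innermost open element that names a property.
        prop = next((p for p in reversed(self._stack) if p), None)
        if prop and data.strip():
            self.fields[prop] = self.fields.get(prop, "") + data.strip()


def to_vcard(fields):
    """Serialize the collected fields as a vCard (4.0, where only FN is required)."""
    lines = ["BEGIN:VCARD", "VERSION:4.0"]
    for prop, name in (("fn", "FN"), ("email", "EMAIL"), ("tel", "TEL")):
        if prop in fields:
            lines.append(name + ":" + fields[prop])
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"
```

Feeding the parser a fragment such as `<span class="fn">Jane Doe</span>` and passing `parser.fields` to `to_vcard` yields text that an address book's existing vCard import could read.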

Post by Hallvord R M Steen
If I select an E-mail in my webmail and copy it, it would be awesome
if my desktop mail client would just import the full E-mail with
complete headers and different parts if I just switch to the mail
client app and paste.

Couldn't this be solved by the web mail server providing an export
feature which let the user download the email as an .eml file and open
it with their mail client? Again, I don't believe the solution to this
requires a copy and paste operation. However, I'm not sure what problem
you're trying to solve. Why would a user want to do this? Why can't
users who want to access their email using a mail client use POP or IMAP?
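
For reference, the export route can be sketched with Python's standard `email` package: the server side serializes a message, with headers and parts, into the bytes of an .eml file, and the mail client parses those bytes back. The function names `export_eml` and `import_eml` and the sample addresses in the usage note are assumptions for illustration only; a real webmail would read the message from its own store.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def export_eml(sender, recipient, subject, body, attachment=None):
    """Build a message and return it as the bytes of an .eml file."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    if attachment is not None:
        filename, data = attachment  # (str, bytes)
        msg.add_attachment(
            data, maintype="application", subtype="octet-stream", filename=filename
        )
    return msg.as_bytes()


def import_eml(raw):
    """Parse .eml bytes the way a desktop client might on import."""
    return message_from_bytes(raw, policy=policy.default)
```

For example, `import_eml(export_eml("a@example.org", "b@example.org", "Hello", "Hi there"))["Subject"]` gives back the subject, and any attachments survive the round trip as separate parts.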

--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/

Hallvord R M Steen

2009-01-20 02:43:33 UTC

Post by Hallvord R M Steen
I'd like some way to add meta data to a page that could be integrated
with the UA's copy/paste commands.

Post by Lachlan Hunt
These use cases are a good start, but the problem is that you've begun with
the assumption that copy and paste would be a part of the solution.

That's not a bug, it's a feature :)

Ian said a while ago that coming up with end-user friendly UI ideas
for the RDF stuff was harder than doing the technical work - implying
that if there are no good UI ideas, browser vendors would not find a
nice way to let users use metadata, and many of the use cases for
embedding it in HTML would not really be feasible.

Thus, this *is* a UI proposal, aiming to show that an operation nearly
*all* users are familiar with could be enhanced with richer ways to
embed meta data in HTML.

Post by Lachlan Hunt
Do you mean a bibliographic entry that references the source web site, and
included information such as the URL, title, publication date and author
names?

Exactly.

Post by Lachlan Hunt
That could be a useful feature, even if it could only obtain the URL
and title easily.
Often, when writing an article that quotes several websites, it's a time
consuming process to copy and paste the quote, then the page or article
title and then the URL to link to it. An editor with a Paste as Quotation
feature which helped automate that would be useful.

It would be great. I hate the clumsy back-and-forward switching to
copy/paste all those bits of information ;-p

Post by Lachlan Hunt
HTML5 already contains elements that can be used to help obtain this
information, such as <title>, <article> and its associated headings <h1>
to <h6>, and <time>. Obtaining author names might be a little more
difficult, though perhaps hCard might help.

Indeed. And it's not an either-or counter-suggestion to my proposal,
UAs could fall back to extracting such data if more structured meta
data is not available.

This is way more complicated for most users. Your last sentence IMO is
not an appropriate way to use the word "just", seeing that you need to
find and invoke an "export" command, handle files, find and invoke an
"import" command and clear out the duplicated entries.. This is
impossible for several users I can think of, and even for techies like
us doing so repeatedly will eventually be a chore (even if we CAN, it
doesn't mean that's the way we SHOULD be working).

Besides, it doesn't really address the "copy ONE contact's
information" use case well.

Also, should any program that wants to support copy-and-paste of
contact information have to support text/html parsing and look for
class="" values? I guess that would be quite some work for the rather
limited functionality microformats give you. It would be better to
have a microformat-aware UA generate a common meta data interchange
format for the clipboard, and from there it seems a small step
to allow web page authors to embed richer meta data the UA can use to
generate the clipboard meta data, right there in their HTML.

Post by Lachlan Hunt
Couldn't this be solved by the web mail server providing an export feature
which let the user download the email as an .eml file and open it with their
mail client?

Of course, that or POP/IMAP access is the way things currently work.

Post by Lachlan Hunt
Again, I don't believe the solution to this requires a copy
and paste operation.

..but I think it would be more intuitive and user friendly if
something like that worked. (Or drag-and-drop an E-mail from the
webmail to the desktop client/file system/other webmail, which is
basically the same thing).

Post by Lachlan Hunt
However, I'm not sure what problem you're trying to
solve. Why would a user want to do this? Why can't users who want to
access their email using a mail client use POP or IMAP?

Granted this use case is a bit more far-fetched (but I know people who
copy E-mails from their Outlook and paste in Windows Explorer! - for
"backing up" or archiving a message they want to keep).

--
Hallvord R. M. Steen

Calogero Alex Baldacchino

2009-02-04 03:17:59 UTC

Post by Hallvord R M Steen

Post by Lachlan Hunt
HTML5 already contains elements that can be used to help obtain this
information, such as <title>, <article> and its associated headings <h1>
to <h6>, and <time>. Obtaining author names might be a little more
difficult, though perhaps hCard might help.

Indeed. And it's not an either-or counter-suggestion to my proposal,
UAs could fall back to extracting such data if more structured meta
data is not available.

I think that is a counter-suggestion, though. If UAs can gather enough
information from existing markup, they don't need to support further
metadata processing; if authors can put enough information in a page
with existing markup (or markup being introduced in the current
specification), they don't need to learn and use additional metadata to
repeat the same information. It seems that any additional
metadata-related markup would add complexity to UAs (requiring support)
but no advantages (with respect to existing solutions) in this case.

Therefore, the question shifts to the format used to move such
information to the clipboard, which is a different concern from
embedding metadata in a page. Also, different use cases should lead to
different formats, with each kind of information either kept in its
own clipboard entry or bound into a sort of multipart envelope that is
serialized as a single entry. A generic format addressing a lot of use
cases could seem overengineered to developers dealing with one specific
use case, so a specific format could gain support in other applications
more easily.

Third-party developers could find it easier and more consistent to get
the right information in the right format, either by looking for a
specific entry (if supported by the OS) or by parsing a few headers in
a multipart entry to find the offset associated with a MIME type. The
latter would work without requiring OS support, though an OS could
still provide facilities to access the proper section directly. In any
case, support for multiple kinds of information should be in scope for
the OS clipboard API and/or the UA, not for a specific application
requiring specific data, and, given the above, it should not be in
scope for HTML5.
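
The "multipart envelope" idea, with headers that map each MIME type to an offset, could look roughly like the following Python sketch. The wire layout (one header line per entry holding type, offset and length, then a blank line, then the concatenated bodies) and the names `pack_entries` and `extract_entry` are assumptions invented here for illustration; no existing clipboard format is implied.

```python
def pack_entries(entries):
    """Serialize {mime_type: bytes} into a single clipboard blob.

    Layout (hypothetical): header lines "type offset length", a blank
    line, then all bodies concatenated. Offsets are relative to the
    start of the body section.
    """
    header_lines = []
    body = b""
    for mime, data in entries.items():
        header_lines.append("{} {} {}".format(mime, len(body), len(data)))
        body += data
    header = "\n".join(header_lines).encode("ascii") + b"\n\n"
    return header + body


def extract_entry(blob, wanted_mime):
    """Return the bytes stored for wanted_mime, or None if absent."""
    # The first blank line ends the header; bodies may contain anything.
    header, _, body = blob.partition(b"\n\n")
    for line in header.decode("ascii").splitlines():
        mime, offset, length = line.rsplit(" ", 2)
        if mime == wanted_mime:
            start = int(offset)
            return body[start:start + int(length)]
    return None
```

An application interested only in contact data would call `extract_entry(blob, "text/directory;profile=vcard")` and never look at the other sections.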

Post by Hallvord R M Steen

Post by Lachlan Hunt

It can be improved, but it's the _best_ way to do that, and should be
replicated in the "copy-and-paste" architecture you're proposing.
Please consider that a basic usability principle says users should be
able to understand what's going on based on previous experience (that
is, an interface has to be predictable). Users generally aren't used to
copying and pasting anything other than text, so a UA should
distinguish between a bare "copy" option and more specific actions
(such as "copy as quotation", "copy contact info", and so on), with
related paste options as needed, so that users can understand and
choose what they want to do.

On the other hand, the same should happen in a recipient application,
especially one that supports different kinds of information. If either
a UA or a recipient application (or both) provided only a simple copy
and a simple paste option (or fewer options than the kinds of data it
supports, based on metadata or common markup), it could confuse users.
Nor should applications use metadata to choose what to do, because the
user might just want to copy and paste some text (or do something
else, but he knows what, so he must be free to choose it).

That is, what you're proposing is mainly addressed by moving
import/export features into a context menu and making them work on a
selection of text (not by eliminating them and substituting a "simpler"
copy-paste architecture), then requiring support from other
applications and possibly from the operating system, which is
definitely out of scope for any web-related standard. We can constrain
web-related applications to improve their interoperability with respect
to web-related features, but not generic client-only applications
and/or operating systems to create a "brand-new" kind of interaction
and interoperability, and UA implementors wouldn't be happy to
implement something they know to be incompatible with existing
platforms.

Post by Hallvord R M Steen
Besides, it doesn't really address the "copy ONE contact's
information" use case well.

Assuming a social network service provider wanted to support it, I
think the best way to handle this case is to put the metadata modelling
a contact's information in a separate, non-HTML file, so as to provide
better control over sensitive data and enforce privacy, and to expose
it as a linked resource, accessible with proper rights (and modified
server-side according to lower- or higher-level rights). For instance,
a nickname in a page could be part of an anchor pointing to a homepage,
with an associated context menu linking to exportable metadata.

This would work in just about every UA. A compliant one could
recognize the metadata format while fetching it (e.g. through its
headers, or by sniffing its content), copy it to the clipboard in an
appropriate manner and notify the user, or activate a plugin associated
with the proper MIME type (or just ask the user what to do). A
non-compliant one could just show it as plain text, which users could
copy and paste as serialized metadata into an application supporting
such a format. That format could fit the purpose very well and thus
become widely supported as a contact info interchange format (a plugin
associated with a MIME type would work in this case too, as would
asking the user what to do: save locally, open with an external
application, convert to something else if possible, and so on).

Post by Hallvord R M Steen

Post by Lachlan Hunt

Couldn't this be solved by the web mail server providing an export feature
which let the user download the email as an .eml file and open it with their
mail client?

Of course, that or POP/IMAP access is the way things currently work.

Post by Lachlan Hunt
Again, I don't believe the solution to this requires a copy
and paste operation.

I think that drag-and-drop would work better in this case, and without
necessarily requiring a clipboard mechanism (that is, unlike a
copy-and-paste operation).

For instance, a webmail interface could provide a link (named ".eml
version" or "get a copy" or "save locally" or the like) to an .eml file
(that is, to the mail entry in a server database, dynamically extracted
and served with proper headers when requested; some webmails already
do something like this). Given that, a user would have two choices:

1) Just follow the link, so that his UA would recognize a non-HTML
document and either

* open it through a plugin;
* open it through an external program;
* ask the user what to do, including the option to save the file locally.

This is consistent with other similar operations users are familiar
with (such as opening a PDF file), and doesn't require any particular
further support from the UA, the OS, or the recipient application. Of
course, the user could also right-click the link and select "save as"
(or "save target as", or whatever else it is labeled).

2) Drag-and-drop the link to the desktop, or to another application, so
that either a symlink (or an "internet shortcut" or whatever else it is
called) or an .eml file (to avoid authentication issues) is created (on
the desktop, or in a temporary folder, as needed). This can work quite
well when dragging to the desktop (it should create a symlink on most
platforms), and requires a little more support to create an .eml file
(this is already possible on some platforms that always ask what to do
with a dragged 'object') and/or to improve direct drag-and-drop between
applications (currently it may not work, or may produce the same effect
as copying and pasting the resource address).

In the latter case (direct drag-and-drop), recipient applications could
recognize the dragged text as a URI and try to open it the same way
they open a local file or follow a symbolic link, thus reducing the
need for OS support. Authentication issues could be solved by
supporting HTTP authentication (or a form-based challenge) either in a
library used by recipient applications, or even in a system library
(freely implemented by an OS, without any explicit requirement). That
could be an improvement for generic resource location and access
(especially when dealing with symlinks), and therefore useful in
contexts other than just enhancing the interoperability between a
browser and other applications, and so more likely to be implemented.
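
The first step on the recipient side, deciding whether dropped text is a URI to fetch or just text to insert, might be sketched in Python as follows. The function name `classify_drop` and the rule of accepting only http/https URIs without embedded whitespace are assumptions for this example; fetching and authentication are deliberately left out.

```python
from urllib.parse import urlparse


def classify_drop(text):
    """Return ("uri", cleaned_uri) or ("text", original_text) for dropped text."""
    candidate = text.strip()
    parts = urlparse(candidate)
    looks_like_uri = (
        parts.scheme in ("http", "https")
        and bool(parts.netloc)
        and not any(ch.isspace() for ch in candidate)
    )
    if looks_like_uri:
        return ("uri", candidate)   # the application may try to open it
    return ("text", text)           # fall back to an ordinary text drop
```

An application that gets `("uri", ...)` back could then try to open the resource as it would a local file, and fall back to inserting the address as text if that fails.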

The overall mechanism (normal link to a resource + normal or slightly
improved drag-and-drop of the link) should work very well on most
platforms and fall back gracefully on others, partly because it
already works as is, and partly because it would require, overall,
less work to support than a copy-and-paste mechanism based on
metadata (or at least an effort which could be useful in contexts other
than just a rich copy-and-paste between applications, whereas that is
out of scope for HTML5, and will remain so until some UAs and non-UA
applications have experimented with it on some platforms).

On the other hand, a copy-and-paste mechanism (working as if saving an
.eml file, or just putting metadata into the clipboard) would reduce or
eliminate authentication issues (though it would require strong OS
support, which is a hard goal for a web standard), but I think it is
less usable, all things considered. Consider the case of a message
containing a big attachment, such as a PDF file, which is never
immediately downloaded by any webmail or IMAP client. A UA could hang
while downloading it as the result of a copy operation, blocking the
following paste (if not the whole OS clipboard, if locked). Not doing
so could cause a wrong paste by a user who doesn't notice that his
browser is still working and tries to paste something immediately
afterwards, as is usually possible. I think that an immediate
drag-and-drop, with the recipient application (possibly a window
manager) handling the download, is a better solution (after all, that's
what usually happens when opening a big file from a slow source).

WBR, Alex

Hallvord R M Steen

2009-01-20 02:19:26 UTC

I think that the already available solution to your problem is Microformats
- you are essentially embedding metadata, semantically, in HTML.

Of course, but I think your comment misses half of the proposed
solution.. namely what format the UA puts the information on the
clipboard in.

If you say microformats is the solution, I assume you mean UAs should
put HTML fragments with microformat-type attributes and values (mainly
class) on the clipboard as text/html, and applications that were
targeted by a paste operation should have HTML parsers and implement
support for specific microformats.

Besides this, the applicability is rather specific - every application would
need built-in support and every website would have to mark up the data in a
specific way to support the application's format.
This could get far too confusing and complicated...

It would not necessarily need support from the website - the UA could
have some logic to create associated meta data (URL, title, possibly
author from <META> tags though that wouldn't be very reliable) for the
bibliographic stuff if the page did not contain more specific meta
data for this purpose.
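
A minimal Python sketch of that fallback logic: when a page carries no dedicated meta data, derive a bibliographic entry from the URL, the <title> element and an author <META> tag if one exists. The class name `FallbackMetadata`, the function `bibliographic_entry` and the shape of the returned dict are assumptions for illustration, and as noted above the author value is only as reliable as the page's markup.

```python
from html.parser import HTMLParser


class FallbackMetadata(HTMLParser):
    """Pick up <title> text and <meta name="author" content="..."> from a page."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.author = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
            self.title = ""
        elif tag == "meta" and (attributes.get("name") or "").lower() == "author":
            self.author = attributes.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def bibliographic_entry(html, url):
    """Return a dict with whatever could be recovered; missing values are None."""
    parser = FallbackMetadata()
    parser.feed(html)
    title = (parser.title or "").strip() or None
    return {"url": url, "title": title, "author": parser.author}
```

A UA could run this only when no richer embedded meta data applies to the selection, so the two approaches complement each other.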

With Facebook I could write a Facebook application to generate the
meta data format - Facebook would not really need to support this.

With any other website I could add a User JavaScript or Greasemonkey
script that was aware of that site's markup and could extract the
information in a site-specific way and make it available to the UA as
HTML-embedded meta data..

--
Hallvord R. M. Steen

Lachlan Hunt

2009-01-20 11:40:23 UTC

Post by Hallvord R M Steen

I think that the already available solution to your problem is Microformats
- you are essentially embedding metadata, semantically, in HTML.

Of course, but I think your comment misses half of the proposed
solution.. namely what format the UA puts the information on the
clipboard in.

Determining how one application passes information via the clipboard to
another application seems very much out of scope of HTML. If there was
such a method available, then we could investigate how to obtain the
relevant semantics from the document. But we can't do that until there
is some clipboard format available for this purpose that other
applications can understand.

I doubt that it would be possible to create some generic format that
would be suitable for such a wide range of use cases. For an address
book application, the most sensible approach would be to add a vCard
format (text/directory;profile=vcard) to the clipboard. Given that many
address books already support the vCard file format, it's not such a
stretch to believe that they could read the data in that format from the
clipboard. The problem is getting them to support an import from
clipboard feature.

However, other use cases, like pasting a quote into a word processor
complete with bibliographic information, would need an entirely
different format.

--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/

Hallvord R M Steen

2009-01-30 15:53:36 UTC

Post by Lachlan Hunt

Post by Hallvord R M Steen
Of course, but I think your comment misses half of the proposed
solution.. namely what format the UA puts the information on the
clipboard in.

Determining how one application passes information via the clipboard to
another application seems very much out of scope of HTML.

If we keep considering clipboard support "out of scope" it means web
applications will continue to SUCK at copy/paste support. Copy&paste /
drag&drop is a UI workhorse the Web and its applications generally
can't take much advantage of now. We should do something about that
(where "we" is not necessarily the WHATWG/HTML5 WG, it might also and
possibly more likely be a WebApps WG task - I don't care where it's
done as long as it is done.).

Post by Lachlan Hunt
If there was such
a method available, then we could investigate how to obtain the relevant
semantics from the document. But we can't do that until there is some
clipboard format available for this purpose that other applications can
understand.
I doubt that it would be possible to create some generic format that would
be suitable for such a wide range of use cases.

That, of course, is what the RDF people claim to be doing. Whether it
makes sense and would get used I have no idea, but implementing some
rudimentary support for putting some RDF-markup on the clipboard and
retrieving it would let the Web have a go at figuring out if it IS
usable for information exchange, and shouldn't take too much work if
the generic clipboard API is in place. That's why I like this idea -
from my naturally browser-vendor-centric perspective :-)

Post by Lachlan Hunt
For an address book
application, the most sensible approach would be to add a vCard format
(text/directory;profile=vcard) to the clipboard.

I assume you'll answer the "where should the UA find the structured
information in order to place it on the clipboard" question with
"vCard microformat".

--
Hallvord R. M. Steen