Either you have separation of data and presentation or you don't. HTML didn't. H...

halo · on Nov 8, 2011

HTML is a structured document format, with CSS for styling and presentation.

The web became so popular that developers want HTML to be an unstructured data format with a separate fully-fledged layout language.[1]

Neither idea is wrong, there's just an impedance mismatch between the two.

There are three options:

1. The status quo. Developers mangle technologies designed for something different so it does what they want, with an increasing use of libraries and abstractions. It's inefficient and messy, but it works and is backwards compatible.

2. Give up on the idea of structured data and try to mangle HTML/CSS to be closer to what developers want. You end up with a crap structured document language and a crap web application language, but the web application language will be slightly less crap than the status quo.

3. Create a new language actually designed for web applications with a proper layout engine and HIGs to go with it. Despite being the most sensible option, this isn't realistically going to happen. A lack of backwards compatibility kills adoption and browser makers will never agree. It took them a decade to agree on a font format.

None of the options are ideal. I suspect somewhere between 1 and 2 will happen in reality.

[1] Layout is also a really hard problem. I don't think it's been acceptably solved in the general case.

srl · on Nov 8, 2011

Amen.

More disturbing still: the people who are learning to program /now/ - in this decadent era of webapps and mobile devices - learn the brain damage that is modern web development early on. And it turns out (sorry, no evidence here but my own observations) that when you're first learning something, it's easy to mistake a brain-damaged system for an elegant one. And so, yes, there are people learning to program today, holding up CSS as an example of elegance.

To be honest, I wouldn't complain about CSS so much - even if it lacked the much-sought "separation of data and presentation" (isn't there a buzzword for that somewhere?) - if only accomplishing simple tasks didn't require obscure, unreliable hacks. As long as CSS was capable of formatting a reasonably-constructed document (titles before text, left-hand elements before right-hand elements) without hacks, I'd be happy. But it's not. And it doesn't look like it will be any time in the near future.

rimantas · on Nov 8, 2011

  > As long as CSS was capable of formatting a 
  > reasonably-constructed document (titles before text,
  > left-hand elements before right-hand elements) without
  > hacks, I'd be happy.

What hacks do you need to do that?

joelanman · on Nov 8, 2011

Not sure what is meant by titles before text, but it's impossible with CSS to reliably change the order of elements on the screen. You can try to float them differently, but this introduces side effects, and won't work well with more than two elements.

talmand · on Nov 8, 2011

Of course it's impossible, because CSS was never intended to do that in the first place. Seems to me that too many people are trying to use HTML/CSS in ways it was never meant to be and then complaining about the fact it doesn't work as expected.

I'm totally shocked that my car won't drive up the side of my office building.

If you are wanting to move stuff around in an HTML document with CSS you'll have to go with absolutely positioning every element within a container. Even then you'll need javascript to change classes and/or styles of the elements to accomplish that.

sirclueless · on Nov 8, 2011

That's kind of the whole point. Why even pretend that HTML is data and CSS is presentation when the presentation language can't even reliably order things on the screen?

You claim that "CSS was never intended to do that" but the goal of CSS is explicitly to manage the rendering of a document on a given device/browser. Obviously it was intended to manage layout. Deciding which order to show sections is absolutely something a presentation layer should be able to manage, and CSS can't do it.

talmand · on Nov 9, 2011

I pretend nothing, HTML is for structured data and CSS is for presentation of that data. Just because it doesn't do what you want doesn't mean the definitions are wrong.

CSS was never intended to manage layout, thus it has very few tools to do so. The HTML was intended to manage the layout. CSS changes the presentation of the document as structured by the underlying HTML.

HTML controls the order of elements on the page quite well. What you are wanting is a reliable method to CHANGE the order of elements on the page. That is what I mean that CSS was never intended to do in the first place. HTML/CSS were developed on the idea of structured documents that do not change in real-time.

You are wanting to take methods from a totally different set of standards and force-feed them onto this standard. Web pages were created to be static documents, much like printed pages, not applications.

If you are wanting to control the order of elements on the page in real-time then I would suggest you look into having all your elements absolutely positioned inside a container. Then you can use javascript to move and hide elements all you want. Just keep in mind the pros and cons of doing that.
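
A minimal sketch of the bookkeeping behind that absolute-positioning approach, in Python for illustration only. It assumes every element's height is known up front, which is one of the cons. The element ids, heights, and function names are all hypothetical, and the offsets it computes are what a script would then apply to the elements.

```python
def stacked_offsets(heights, visual_order):
    """Return a CSS `top` offset (in px) for each element id.

    heights: dict mapping element id -> height in px (hypothetical values).
    visual_order: ids in the order they should appear on screen,
                  independent of their order in the markup.
    """
    offsets = {}
    top = 0
    for element_id in visual_order:
        offsets[element_id] = top
        top += heights[element_id]
    return offsets


def to_css(offsets):
    """Render one absolute-positioning rule per element."""
    return "\n".join(
        f"#{element_id} {{ position: absolute; top: {top}px; }}"
        for element_id, top in offsets.items()
    )


# Markup order is title, sidebar, body; on screen the sidebar goes first.
heights = {"title": 40, "sidebar": 200, "body": 300}
offsets = stacked_offsets(heights, ["sidebar", "title", "body"])
print(to_css(offsets))
```

Changing the on-screen order means recomputing every offset, since nothing stays in the normal document flow.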

bmelton · on Nov 8, 2011

With 960.gs, it's quite easy to do this with push_ and pull_ directives.

It doesn't change, semantically, the order of the markup obviously, but that's to be expected.

einhverfr · on Nov 8, 2011

I think you are missing the fact that a good document really needs three things:

1) Data (RDF, NoSQL, RDBMS, etc)

2) Structural presentation (i.e. what HTML or LaTeX provides) and

3) Presentation to the user (i.e. what CSS or macro packages in LaTeX provide)

To get from 1 to 2, you have to have some logic. You could do it with Javascript acting against RDF and HTML, I suppose. Or you could do it with XSLT. However there's no inherent guarantee that inherent data structure will in any way match your document structure and so these are really separate concerns.

This is why HTML template systems are so important for web programming.

alex_c · on Nov 8, 2011

>Either you have separation of data and presentation or you don't. HTML didn't. HTML5 still doesn't.

Is it really so desirable? I understand the appeal in theory, but in practice is it really worth it in most cases?

Or perhaps I'm so scarred by HTML and CSS that I can't even visualize a web with true separation of data and presentation that actually makes life better.

Edit: some of the other comments discuss this. Of course I see the advantage of abstraction and reusability. What I'm really asking about is the advantage, or even feasibility, of a pure, or strict, separation of data and presentation.

jeromeparadis · on Nov 8, 2011

Totally agree.

In my opinion, the problem of separation of data and presentation won't be solved by markup or CSS.

If a Web page is to contain data and a service wants to act on this data, it has to scrape the Web page. Which is even harder with scripted pages. But scraping data isn't a solution. The semantic web may try to have web developers bring sense to data on a web page, but the problem remains. It's just a markup patch. It doesn't define how to act on that data. Web Intents are just another patch to markup to bring verbs.

Direct access to the data sources with well-defined methods to act on that data and interact with it (instead of using forms) is what works today through APIs. What doesn't work is that there aren't a lot of open standard APIs. Most well used APIs are proprietary and Facebook's a good example. I believe standard bodies should put their brains and efforts on defining API standards. Some standard APIs, some of them clunky, are in use in B2B back ends, but there's not much of that on the consumer-facing Web. We must move forward to push separation of presentation, data and verbs for the whole Web, one small step at a time.

If most use cases on the Web used standard APIs, we would have true separation of content and presentation. We would even have the verbs to exchange/create understandable content. Then, you can use HTML/CSS to adapt a UI to any device with true separation.

That's the way we build apps and sites today, and with standards it would pave the way to a more exciting future.

So one day, if I want to have my own customized UI for that new holographic/gesture recognition device to shop with my preferred merchants, I just have to build an app and I'll be able to browse their merchandise, sort it like I want and finalize a transaction without even visiting their Web site.

danssig · on Nov 8, 2011

>If a Web page is to contain data and a service wants to act on this data, it has to scrape the Web page.

This is completely the wrong approach and this isn't how people who know what they're doing work now.

A web page is just a presentation layer. If you have a service that wants data, it needs to work with a model or presenter/controller layer. On the web this can be a REST service, SOAP, something proprietary, etc. Ideally, the web site will be using this same source to get its data.
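
A rough sketch of that arrangement, assuming a hypothetical in-memory product list as the model layer. The function names and data are made up for illustration. The point is only that the JSON a service would consume and the HTML a person would see both come from the same `get_products()` call.

```python
import json
from html import escape

# Hypothetical in-memory data source standing in for the model layer.
PRODUCTS = [
    {"id": 1, "name": "Kettle", "price": 25},
    {"id": 2, "name": "Mug <large>", "price": 8},
]


def get_products():
    """The single source of data that both front ends read from."""
    return PRODUCTS


def products_as_json():
    """What a REST-style endpoint could hand to another service."""
    return json.dumps(get_products())


def products_as_html():
    """What the web site's own presentation layer could render."""
    items = "".join(
        f"<li>{escape(p['name'])}: ${p['price']}</li>" for p in get_products()
    )
    return f"<ul>{items}</ul>"


print(products_as_json())
print(products_as_html())
```

A service reading the JSON never has to parse the `<ul>`, and the markup can change without breaking it.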

If the web application presents data via a web interface and doesn't offer a presentation/controller layer to allow you to access that same underlying API, then yes, you will have to scrape if you want that data for some reason. But I don't see this as wrong, you're doing something the owners of the data didn't intend for you to do. You'll have similar issues if you want to get data out of any application view (e.g. screen scraping a windows native app).

EDIT: Read the rest of your post and I see that you addressed much of this already. I still maintain that this is already how people are working who want others to use their data.

epochwolf · on Nov 8, 2011

> I believe standard bodies should put their brains and efforts on defining API standards.

Have you seen SOAP?

DuncanIdaho · on Nov 8, 2011

He said put brains into it.

Not in the sense of "throw your brains in for the zombies to have a party", but in the sense that one should think and try to find protocols that are elegant, meaning they are easy and simple to reason about and to use.

I agree that coming up with SOAP and XML-RPC took quite some brains and effort, too bad that some really good people had to be lobotomized for it.

jeromeparadis · on Nov 8, 2011

Yes. I prefer something more lightweight. But SOAP is a protocol. What I propose are standards for common use cases that are not as general purpose.

For example, the Facebook API allows querying, interacting with the social graph, profiles, photos, feeds, events, etc. These are use cases commonly used on photo apps, social networks, eventing, etc. But it's proprietary. Now imagine an open source standard similar to that but that can define such building blocks including other scenarios such as contacting a web site owner (about page, contact page), querying/posting articles to a web site, querying/doing transactions with products/services, etc. Once you go through all scenarios, then the problem that remains will be more about agents/authorities/reputation/security of allowing someone to interact with services. With better access for apps to interact directly with content by bypassing the current web presentation layer to avoid spam/fraud.

gambler · on Nov 8, 2011

"HTML1.0 was a local optimum. Somewhere there is another, better optimum. The W3C appears to be willing to travel the Himalayas of suboptimal to find it."

I don't think they're even going in the general direction of better optimum.

Here is what I think that optimum should include:

- Better document model (vs no document model at all, which seems to be where it's all headed right now).

- Separation of content, layout and styling. Yes, into three parts, rather than the two we have right now.

- Partial caching and user-side includes that aren't a blob of ugly, shortsighted hacks. Something that works with the document model, not against it. It's absolutely ridiculous that I have to write custom code to prevent the browser from re-downloading (and the server from re-generating) page headers and so on.

- Significant improvement of forms. New UI elements, support for pure-HTML put/delete requests, different format for sending data that has structure (vs only key-value pairs).
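
To illustrate the last point, here is a small sketch comparing the two encodings, using only the Python standard library. The `order` data and the bracketed key naming (`items[0][sku]`) are assumptions for the example. That naming is a common server-side convention, not something the form encoding itself defines.

```python
import json
from urllib.parse import urlencode, parse_qsl

# Hypothetical structured data a form might want to submit.
order = {
    "customer": "alice",
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

# Key-value encoding: the nesting has to be flattened into key names.
flat = [("customer", order["customer"])]
for i, item in enumerate(order["items"]):
    for key, value in item.items():
        flat.append((f"items[{i}][{key}]", str(value)))

form_body = urlencode(flat)
received_pairs = parse_qsl(form_body)  # a flat list of string pairs

# Structured encoding: the shape and the types survive the round trip.
json_body = json.dumps(order)
received_order = json.loads(json_body)

print(received_pairs)
print(received_order)
```

With the key-value body, the receiver gets strings only and has to rebuild the list of items from the key names. With the structured body, the receiver gets the nested data back as sent.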

einhverfr · on Nov 8, 2011

Generally agreed, but would suggest a slightly different set of concerns separated.

Instead of separation of content, layout, and styling, I would suggest a separation of data, structure, and styling. Data + structure gives you content, structure plus styling gives you layout.

So in this idea you might have RDF as data, an HTML template as structure, and CSS as styling. The browser would generate the HTML from the template and the RDF, and the CSS would then be used to lay it out and style it.
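
A minimal sketch of that pipeline, with heavy assumptions: plain subject/predicate/object tuples stand in for RDF, Python's `string.Template` stands in for the HTML template, and the rendering happens in Python where the idea above has the browser doing it. All names and values are hypothetical.

```python
from html import escape
from string import Template

# Data: subject/predicate/object triples, a stand-in for RDF.
triples = [
    ("post:1", "title", "Layout is hard"),
    ("post:1", "author", "alice"),
    ("post:2", "title", "Unrelated post"),
]

# Structure: an HTML template with named slots.
template = Template(
    '<article><h1>$title</h1><p class="byline">$author</p></article>'
)

# Styling: kept separate; it only refers to the structure.
css = "h1 { font-size: 2em; } .byline { color: gray; }"


def render(subject, triples, template):
    """Data + structure gives content: fill the slots for one subject."""
    fields = {p: escape(o) for s, p, o in triples if s == subject}
    return template.substitute(fields)


html = render("post:1", triples, template)
print(html)
```

The template never changes when the data does, and the stylesheet never changes when a different subject is rendered.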

fooandbarify · on Nov 8, 2011

While I certainly agree that there is tons of room for improvement with HTML/CSS/JS, I get confused when people start discussing it in such hyperbolic terms. It's not that bad. Most data on the internet fits really nicely into the document metaphor.

knieveltech · on Nov 8, 2011

Normally I immediately bridle when someone offers criticism without including proposals for improvement[1] but in this case it really is that bad. It really, really is.

And while I would be willing to agree provisionally that most data (by volume of unique URLs) on the web does fit the document metaphor, if page views is your metric I'm not at all convinced.

For example, is it logical to even attempt to reason about a Facebook wall containing recent updates from $n individuals in terms of authorship? Is this even relevant information given the entire contents of the page will have changed in 12 hours?

The document metaphor made perfect sense 15 years ago but it breaks down quickly in the face of anything dynamic, as is evidenced by the need for any credible web developer to have a minimum of 7 largely unrelated technologies[2] committed to memory to do their job effectively.

Markup 15 layers deep? 2000+ lines of code to tell the browser how to render a website? Vendor-specific dynamic rendering engines to sidestep the limitations of native web languages? Surely this is not what success looks like?

[1] Unfortunately I have no idea how to fix this mess.

[2] HTML, CSS, JavaScript, a JS Framework (typically jQuery), at least one back-end language, SQL or similar and API stuff (SOAP, JSON, etc).

fooandbarify · on Nov 8, 2011

Hmmm. I appreciate the way you laid out your argument, I'm not sure I understand it entirely though. I agree that the concept of authorship is not very logical in the context of your example, but I don't know of any HTML spec which requires defining an author for each document? I think my point was that it's pretty easy to mark up data in a semantic-enough way using the current tools. A wall post doesn't need to be a document of its own, it can be an item in a list of wall posts that make up part of a bigger page. I agree that "document" is sort of a silly metaphor for that use case, but that doesn't make HTML any less useful. We could use XML, but that would basically be the same thing. We could use JSON, but again... it's just another way of drawing the same relationships. I suspect that I've completely missed your point though, in which case I apologize and please be patient with me!

With regard to the rest of your comment (7 unrelated technologies, deep layers of markup, huge numbers of LOC, etc) I agree, but as you said - how could it be fixed? The reality is that the web performs a complicated function. It would be nice to abstract the nuts and bolts behind it away cleanly, and I don't think it's unreasonable to believe that could happen in our lifetime, but it's also not unreasonable to expect that developing for a complicated platform will be complicated.

einhverfr · on Nov 8, 2011

I think it is this unrelated technologies bit that is driving things like NoSQL.

Interestingly, with LedgerSMB, you have to know: HTML, CSS, Javascript, Template Toolkit, Perl, SQL (including PL/pgSQL), LaTeX

That's only 7..... When we standardize on an AJAX API framework, I guess that will mean 8. However LaTeX is only required by some specialists (customizing printed check and PDF invoice templates), and a lot of the current approach is to hide the AJAX stuff inside TT widgets, meaning no more than 7 for most developers.

And since the LaTeX stuff is a specialty (customizing higher-end printed templates and printed checks), that leaves only 6 for most developers. And since the SQL stuff can be easily handed off to others in the community (because the db API is defined through SQL, mostly through a procedural interface), it means 5. The perl is thin glue, and probably should hardly count (unless you are engineering the framework). A few master them all. Most work with the framework we provide. So here you have to know 4 well to do basic customizations, but 7 well to do the most advanced.

Works pretty well, actually.