
Thanks, Dr. Pielke!

This week Roger Pielke Sr. retired his weblog. I just wanted to thank him for the effort he had put into it. I suspect he informed and influenced far more people than he knows.

I’ve done professional work in both sustainability and energy, and have been personally compelled to stay on top of climate science. With a career of experience in data, complex systems and computer models, it’s been obvious to me for a while that a) the climate is a very complex system, and we don’t fully understand it; b) the data we have about our climate covers a very small window of time, and while its quality continues to improve, simple questions like “what’s the average temperature?” are non-trivial and prone to unexpectedly high error bars; and c) predicting the climate’s future relies on very complex computer models that have not yet shown that we should trust all of their output.

With this personal perspective and a polarized environment where both sides routinely make absolute claims about non-absolute results, Dr. Pielke was a source of perspective and a guide through important, recent papers and results. I valued his open-mindedness and intellectual honesty; he was never afraid to hear opposing views or give them visibility.

Dr. Pielke recorded his core beliefs here – they’re worth a read. One thing of note is his opinion that our strong focus on GHG emissions is keeping us from giving other human impacts on the environment their due – something which deserves attention.

Thanks again, Dr. Pielke, for the past work on your blog!

Finally, we are fortunate to still have Roger’s son, Roger Jr., blogging regularly. I am a Breakthrough Fellow along with Roger Jr. and always learn from him whenever I read him on the web or interact with him in person.

Rebuilding After Sandy: Can We Do Better On Energy?

Over at Huffington Post my friend Bernard David asks an important question: As we rebuild after Sandy, what are we going to do differently than before? Are we going to just rebuild what was there previously, or consciously decide to make changes that will reduce the impact of future natural disasters?

While the debate will continue about the degree to which Sandy was or wasn’t influenced by human-induced climate change, for the purpose of this discussion I join Roger Pielke Jr. and others in arguing that it doesn’t matter. There have always been storms as strong as or stronger than Sandy hitting our eastern seaboard, and there will be more in the future. In addition to coastal storms, we will surely face earthquakes, blizzards, and other types of natural disasters. And as Roger says, “There are more people and more wealth in harm’s way”.

The important question is how we respond to that reality.

Things We Can Change

Clearly we need to examine the design of our building and transportation infrastructure, looking at potential changes in zoning, design standards, building codes, and insurance. Each of these has policy components at various levels of government. These policy questions also raise important questions of personal freedoms and responsibility: Do we allow people and organizations to make risky decisions about what they build where? And to what extent does the rest of society pay to help mitigate those risks?

While I, in no way, mean to minimize the suffering, health impacts and economic woes that resulted from damage to buildings and transportation, I would argue that the failure of our energy infrastructure lies at the heart of the dramatic breadth and depth of Sandy’s impact. We need to seriously question why so many people were without power, and why that power was (or still is, in many cases) out for so long.

In order to appreciate how different things could have turned out, imagine the ideal energy scenario where no one had lost power from Sandy. The number of people impacted by the storm drops dramatically, and our ability to clean up and recover from the other impacts is massively improved. This scenario isn’t total fantasy: Andy Revkin highlights successes amidst the chaos at NYU and Co-op City using natural gas co-generators which produced both heat and electricity for these facilities.

We need to use the tragedy of Sandy as motivation for a major program to disaster-proof our energy infrastructure. A logical approach would have local and national components.

Local Energy Resilience Program

Every city has its own energy reality. The in-place infrastructure, the mix of sources, the possible threats, and the possibilities for backup and alternative energy are all unique. As a result, each city needs to do its own assessment and improvement plan. That’s not to say that cities and regions don’t share challenges and can’t learn from each other, but in the end the possibilities and responsibilities have to be locally owned.

Each city needs to understand the vulnerabilities of its current infrastructure, and plot a strategy for addressing these vulnerabilities through smarter processes, use of technologies and contingency plans. To start, every municipality should have real, public targets and goals: if disaster X happens, how many people lose power, and how soon is it restored? The same goes for disaster Y, disaster Z, and so on.

These targets then become the basis for an improvement plan. How much better can we make things a year from now? Three years from now? Twenty years from now? Some elements of these plans will be public works projects, but others will be process improvements, new elements of contracts with private utility companies, etc.

It is important that these targets be public, and that they be publicly evaluated after each disaster. If NYC had published a goal stating that, after a direct hit from a tropical storm, power in lower Manhattan could be out for up to two weeks, would the public have stood for that? Might that have caused improvements to be put into place before Sandy?

Finally, there is an economic element of this effort that goes beyond insurance. Increasingly, companies understand their reliance on energy regardless of their product or service, and a reliable supply of clean energy at a reasonable cost is a serious component of business location decisions. For example, talk to anyone who plans data centers, and you’ll understand that energy resilience is not an abstract concept.

National Energy Resilience Program

While the detailed plans lie at a local level, the federal government has two important roles to play: addressing extra-municipal infrastructure, and ongoing research and advanced technology development.

While cities need to take the lead on their own resiliency plans, some issues are outside the realm of any specific locality and so need federal attention. For example, the interstate power grid is a critical resource, and any issues there naturally cascade down to the local level.

Like the local resiliency programs, the core activity is to understand failure modes, set public targets for downtime in different scenarios, and lay out a roadmap for improving those targets. In this case the targets are a vital input to the local resiliency plans. For example, if the target downtime for long-distance grid power is 6 hours, a city should factor that into its own targets and into its plans for improving them over time with backup power and other techniques.

In addition to extra-municipal resiliency, the federal government has a significant opportunity to provide new, improved options to localities by supporting research and advanced technology development in support of resiliency. For example, is NYU’s natural gas-powered backup system a model for other organizations? How can that system be made more cost-effective and reliable? Are there even better ideas?

While many of DOE’s charters involve vague goals, this has the potential to be a focused, mission-driven activity of the same nature as DOD’s increasingly successful energy investments. In any large organization, projects with clear, measurable goals will always make more headway in the long-run compared to projects with vague, high-level goals.

A Closing Thought

While I’ve discussed the need to address energy resiliency from a local and federal government perspective, Sandy should be a wakeup call for every organization and household. NYU didn’t rely solely on NYC for its energy supply, and some are going a step further: The New Republic calls for moving the entire grid to an Internet-style, distributed energy system.

Fortunately the planning blueprint I’ve laid out above applies equally well down to the individual household level. Evaluate your vulnerabilities, set targets, and lay out a roadmap for improving them.

If individuals and private organizations of all types put their own plans into place, that will naturally put pressure on cities and the federal government to have clear, public targets and plans for improvement.

Specifying Open Climate Science: A First Attempt

In my last post, I used lessons from the open source software community and the Creative Commons effort to explore what we mean by “open climate science”. In this post I’m going to take the next step and propose a specification for open climate science. Finally, in subsequent installments I will look at how to implement this specification using our current intellectual property legal framework.

Before I dive in, it is worth reiterating that I am not a scientist (and, by logical extension, not a climate scientist). I have a lot of experience with open source through my job at Sun, and I believe that much of it is applicable to this situation. However, I’m sure there are subtleties that I will miss in this effort. Hopefully, though, it is complete enough to help facilitate a broader discussion.

The first step is to define some terms that describe the process of climate science (I’m open to suggestions for these). I’ll use the following to describe the process itself:

Climate science consists of running a calculation across one or more data sets and producing a result data set. Scientific conclusions or observations are based on the result data set.

Beyond that we will also need some terms to help us talk about the mechanics of doing the science as described above:

  • Some data sets are raw, meaning that they are taken directly from human or machine based observations.
  • A calculation is done using an algorithm that is embodied as software.
  • Software can exist in source and binary form (generally, source code is the software humans write and can read; binary code has been compiled into the 1s and 0s that a computer understands).
  • The term metadata will describe any additional information, beyond the software and data sets, which is required to understand or accurately recreate a result. This includes the schema of the data (e.g. the units of the numbers), required software tools or libraries (including version numbers), the models and calibration of sensors, etc.
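To make these terms concrete, here is a minimal Python sketch of my own. Everything in it (the class names, the toy station records, the averaging “calculation”) is purely illustrative and not drawn from any real climate tool or data source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class DataSet:
    name: str
    values: List[float]                          # the observations or computed numbers
    metadata: Dict[str, str] = field(default_factory=dict)  # e.g. units, sensor model
    raw: bool = False                            # True if taken directly from observations

@dataclass
class Calculation:
    name: str
    source: Callable[[List[DataSet]], DataSet]   # the algorithm, embodied as source code
    metadata: Dict[str, str] = field(default_factory=dict)  # e.g. required library versions

def run(calc: Calculation, inputs: List[DataSet]) -> DataSet:
    """Run a calculation across one or more data sets, producing a result data set."""
    return calc.source(inputs)

# A toy example: average two made-up station records (all values invented).
station_a = DataSet("station_a", [14.1, 14.3], {"units": "degC"}, raw=True)
station_b = DataSet("station_b", [13.9, 14.0], {"units": "degC"}, raw=True)
pairwise_mean = Calculation(
    "pairwise_mean",
    lambda ds: DataSet("mean",
                       [(a + b) / 2 for a, b in zip(ds[0].values, ds[1].values)],
                       {"units": "degC"}),
)
result = run(pairwise_mean, [station_a, station_b])  # the result data set
```

The point of the sketch is simply that the raw data sets, the calculation’s source code, the result, and the attached metadata are all distinct artifacts, and the specification below has to say something about each of them.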

The second step is to identify the use cases that we’d like the concept of “open climate science” to support. While ‘use cases’ may sound benign, they are in fact the most critical part of this discussion. If there isn’t agreement on whether these use cases embody the philosophy of ‘open’ that we are looking for, then the rest of the discussion is pointless. So far I’ve been able to identify three main use cases:

  1. Allow others to reproduce and verify the result data set. An example might be a scientist or other interested party who looks at a result and says “That surprises me. I wonder if there’s something funny in the data, or if they made a mistake in the code?”

  2. Allow others to run variations on the calculation. For example, another scientist might want to run a variation of the calculation on the same data sets in order to see what the impact is on the result.

  3. Allow others to combine some or all elements of the experiment with other elements and produce a different experiment. For example, another scientist may want to take a calculation which measures the correlation of two data sets and apply it to a third data set to see if the resulting correlation is similar.

Some may say that use cases #2 and #3 could be difficult to tell apart in some situations, and they would be correct. This is also a common issue in music and literature: as many artists and authors “sample” other people’s work, the question frequently arises whether the new work is just a variation on the earlier work, or is in fact a new work that is distinct from the earlier one. Since copyright law distinguishes between these two cases, understanding where that line is may be important in those situations. However, if we agree that both #2 and #3 are important aspects of open climate science, then defining the line between them becomes less important.

It’s also important to note that we haven’t yet talked about avoiding undesirable use cases. For example, I’d rather not have someone take my results, write a paper about them and imply that they created the data themselves. Or I may be upset if someone puts my result data sets on a CD and sells them to others. These are important, and we will start the process of dealing with them below.

Finally, let’s take the use cases and definitions above and write a proposed specification of open climate science.

For the purpose of reproducing results, running variations of the calculations, and creating derived calculations, the science is open if:

  1. All input data sets are freely available and include all relevant metadata necessary to perform the desired calculations.

  2. All calculations are freely available in source code form and include all relevant metadata necessary to execute the code as-is or make reasonable modifications.

  3. The results data set(s) are freely available and include all relevant metadata.

  4. All necessary tools are either freely available or commercially available.

For the purpose of this specification, freely available means that digital copies are available for download at no cost (copies on physical media may or may not be available, and may or may not be free). Commercially available means that it is available for sale, such as a specific computer or software package, but does not require that the cost be zero.
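As a purely illustrative sketch (the `Asset` type and `is_open` helper are my own invention, not part of the specification), the four conditions could be expressed as a simple checklist:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Asset:
    """One piece of an experiment: an input data set, the calculation source, a result, or a tool."""
    kind: str                             # "input", "calculation", "result", or "tool"
    freely_available: bool                # digital copies can be downloaded at no cost
    has_metadata: bool = True             # includes the metadata needed to use it
    commercially_available: bool = False  # available for sale (relevant for tools)

def is_open(assets: List[Asset]) -> bool:
    """Check the four conditions of the proposed specification for a single experiment."""
    for a in assets:
        if a.kind in ("input", "calculation", "result"):
            if not (a.freely_available and a.has_metadata):           # conditions 1-3
                return False
        elif a.kind == "tool":
            if not (a.freely_available or a.commercially_available):  # condition 4
                return False
    return True
```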

In order to protect the creators of the open science, we need to add some optional specifications which restrict the users. These can be used at the discretion of the creator.

  1. All data sets and software are available under these terms for non-commercial use only. These may or may not be available for commercial use, and other terms may apply.

  2. Any results published using the data sets and software under these terms must acknowledge the source of the assets which were used.

Another thing creators may want to do is control the permissions around derived works. We’ll look at that more closely in the legal discussion.

That completes the proposed specification. It’s not very long, but it does ask a lot of the open climate scientist.

Before we move on to the next installment, there is one possible change which would strengthen this spec further, and that is to require that all input data sets are either a) raw and freely available, or b) open, using the above definition. This means that no climate science is open until everything it depends upon is also open. This kind of recursive requirement is common in the open source software world, but I will leave it as a question whether it makes sense in the case of climate science as well.
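If we adopted that stronger rule, the checklist sketched earlier would become recursive: an experiment is open only if every experiment that produced its non-raw inputs is also open. A minimal illustration, continuing the earlier (invented) `Asset`/`is_open` sketch:

```python
from dataclasses import dataclass, field
from typing import List

# Continues the illustrative sketch above: assumes Asset and is_open are defined there.

@dataclass
class Experiment:
    assets: List[Asset]                                        # inputs, calculation, results, tools
    parents: List["Experiment"] = field(default_factory=list)  # experiments that produced any non-raw inputs

def is_open_recursive(exp: Experiment) -> bool:
    """Stronger rule: open only if this experiment and everything it depends upon is open."""
    return is_open(exp.assets) and all(is_open_recursive(p) for p in exp.parents)
```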

In the next installment we’ll use this specification and intellectual property law to create a license which can legally formalize this specification.

Towards Open Climate Science

The events that have transpired (physically) at the University of East Anglia and (virtually) around the globe have raised the important question of whether climate science is open and transparent enough. This has led, naturally, to calls for “open source” science.

Personally, this discussion links two amateur passions of mine, climate science and open source. Coincidentally these are central themes of Greg Papadopoulos’ and my book, “Citizen Engineer”, not because we miraculously anticipated this particular point in time, but because we saw these as the two largest knowledge gaps for today’s engineers.

I could write a long article about why an open approach to climate science is the right thing. Even acknowledging the short-term issues it will create, such as the one raised by Roger Pielke Jr. in his response to Andy Revkin, I can argue strongly that it’s the right thing in the long run. So instead of adding to that discussion, I want to move on and talk about what happens next, and propose two activities that can lay the groundwork for the future.

The next important step in this conversation may not be obvious: formally defining what we mean by “open source science”. It’s easy to say that the raw data and code for all peer-reviewed work should be publicly available. But working day in and day out in the world of open source software, I know firsthand that reaching a clear, usable definition is far harder than you might believe.

First, there are a series of practical questions that need to be answered, such as how soon data and code need to be available. Is the live stream from satellites available on the web? Is it OK to sit on the data for 6 months? How about waiting until papers using the data have been published? And as people start to work with the data there will inevitably be demands for related data. For example, how much other data on the real-time operation of a satellite will people want in order to do their own calibration of a raw data product?

But beyond the practical issues there’s a more subtle question of licenses. Your average person on the street would assume that open source software is just plain freely available (i.e. in the public domain), but almost all software that is considered ‘open source’ comes with a license that seriously restricts how you can use it. For example, the license may dictate whether the code can be used in a commercial product, or whether the copyright holder grants the user rights to any patents they may hold that relate to the software. Trademarks and attribution may also play a role. To see some easy-to-understand license options, take a look at the excellent site CreativeCommons.org, which provides a tool for creating your own custom license for your website, blog, music, etc.

As you can imagine, the wide array of possible licenses leads to a long, heated and contentious discussion over what is truly “open” and what isn’t. In the software world the Open Source Initiative (OSI) is a non-profit that was formed for this purpose, and it is generally recognized by the open source community as the standard-bearer of the definition. As you can see on their site, they also maintain a list of widely used licenses and how they stack up against the OSI standard.

One of the most intriguing aspects of the open source licensing world is a class of licenses referred to as “viral” or “reciprocal”. These licenses place requirements on derived works, often that the derived work be placed under the same license. The father of this type of license is the GNU General Public License, or GPL. This clever license uses the US copyright system not to prevent others from using a work, but instead to propagate free and open software. In other words, it says that you get the benefit of using this work, but in exchange you have to share your work that builds on it in the same way.

It’s not hard to imagine using a GPL-like license in climate science. A data set could come with the requirement that results based on the data set also be freely available. Similarly, code used in an algorithm could carry the same restriction (note that the algorithm itself is difficult to control, but code is covered under copyright law and can have an attached license). As you can see, this subtle idea could have broad and lasting implications for the use of data and code in climate science.

So the question of “what is open climate science?” is less well defined than many would imagine, which leads me to two proposed actions.

The first is to find a formal home for the definition of “open climate science”. This important activity needs a home, just as open source software has the OSI, which can manage the process of creating and maintaining the definition. This process will take some time, but if the climate science community is serious about transparency and openness, executing on it will be required to make true progress. (Note: the Creative Commons project [Science Commons](http://sciencecommons.org/) may be useful here, but I don’t know much about it.)

The second proposed action is simpler and can happen quickly. This activity is to publicly document (presumably on a website) basic facts about the ‘openness’ of the top sources of climate data and algorithms. Is the raw data available? Are the algorithms and code available? Who can have access? How do you get them?
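To show what I mean by “basic facts”, here is one hypothetical catalog entry; every value below is invented purely to show the shape of the record, and does not describe any real data source.

```python
# One hypothetical catalog entry; all values are invented for illustration only.
openness_record = {
    "source": "Example Surface Temperature Dataset",  # made-up name
    "raw_data_available": True,
    "algorithms_documented": False,
    "code_available": False,
    "who_can_access": "registered researchers only",
    "how_to_get_it": "request through the maintaining institution's website",
    "last_checked": "2009-12-01",
}
```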

While I’m not equipped to spearhead the first activity, I can certainly help get the second underway. Anyone else interested?

Temperature Data – “Worse than we thought”

The title may have led you to believe that temperatures are rising faster than expected, but the comment is about the data itself.

The various sets of temperature data that we have for climate modeling are not very good, especially as you go back in time. This shouldn’t be surprising: we’re trying to detect changes in temperature on the order of a few tenths of a degree C per decade, and much of our data comes from sensors and collection networks that predate the PC and Internet eras.

What may surprise you is how much the data we have from the past continues to be massaged and manipulated. At Climate Audit, “Worse than We Thought” outlines the recent changes to one dataset.

While the changes are suspect, the lack of transparency about exactly how the data was manipulated is near criminal.