OPEN GOVERNMENT -- Are metadata (the hidden but accessible traces of the sources, dissemination, editing and other textual history of a Microsoft Word document) part of the public record that must be disclosed under the California Public Records Act?  A Deputy San Francisco City Attorney has been saying no: release Word documents only as inert PDFs. But now Arizona's Supreme Court, for what it's worth, has ruled to the contrary.


UPDATE 2, 11/1/2009:  Richard Knee, Chair of the San Francisco Sunshine Ordinance Task Force, adds:

Paul Zarefsky's 2007 paper left out some important information. San Francisco's Sunshine Ordinance (http://sfgov.org/sunshine), Sec. 67.24(a)(1), declares that except in certain narrowly defined cases, "no preliminary draft or department memorandum, whether in printed or electronic form, shall be exempt from disclosure under Government Code Section 6254, subdivision (a) or any other provision. If such a document is not normally kept on file and would otherwise be disposed of, its factual content is not exempt under subdivision (a). Only the recommendation of the author may, in such circumstances, be withheld as exempt."

The city's Sunshine Ordinance Task Force has consistently found in favor of complainants who have requested electronic documents in their original format. Besides pointing to 67.24(a)(1), the task force has held that metadata are disclosable public information, and that when a body of metadata contains information that is barred or exempt from disclosure, the non-disclosable portion must be segregated/redacted and the disclosable portion must be provided.

The bad news is that the task force has no power to penalize sunshine scofflaws in city government; it must instead refer cases of willful violation to the city's Ethics Commission, the Board of Supervisors, the district attorney or the state attorney general. And those entities have invariably dismissed task force-referred cases because respondents have said they were acting on advice from the city attorney's office and were thus not willfully violating the ordinance in withholding requested information.

The task force is trying to remedy this and other loopholes in the ordinance through a package of amendments that it hopes to get on the local ballot in June or November 2010.

UPDATE 10/31/2009:  Hat tip to Kimo Crossman and David Akin for pointing up the following example of what metadata can reveal.

Transparency advocates are excited about the ruling,
because—among other things—metadata has been useful in revealing
the influence of lobbyists and other special interests on the
legislative process: 

One of the most famous metadata lobbying goof-ups occurred in
2004, when Wired busted California Attorney General Bill Lockyer
circulating an anti-P2P [peer-to-peer filesharing] letter that, after a
look at its Word metadata, appeared to have been either drafted
or edited by the [Motion Picture Association of America].


Here's how Deputy City Attorney Paul Zarefsky described the problem for government agencies in a paper presented at a League of California Cities conference in 2007.

A Word document, unlike a paper record or an electronic record in PDF form, contains "metadata" – information about the record that does not appear in the text but is automatically generated by the software program when a text is created, viewed, copied, edited, printed, stored, or transmitted using a computer.  Metadata are typically embedded in the record in a manner not readily viewed – and often not understood – by persons without specialized computer training.  Indeed, many persons who use a computer are unaware that Word documents contain metadata or that electronic transmission of a Word document in that form includes not merely the visible text but also the metadata.  This paper uses the term "metadata" broadly to include any information embedded in an electronic record that is not visible in the text.

Metadata in a city document may include a wide variety of information that the city has a right – and, in some cases, a duty – to withhold from public view.  For example, earlier versions of the electronic record are typically present in metadata.  Often they include recommendations, suggestions, "trial balloons," and tentative ideas that have not been thought through; suggested edits, comments, and criticisms from colleagues and supervisors; and words, phrases, sentences, and even entire passages that solely reflect the author's thought process because they were deleted early on by the author and thus never communicated to another person.  Thus, much information in metadata may arguably be withheld from disclosure under Section 6254(a) as preliminary drafts or notes, or under Section 6255(a) and the deliberative process privilege.

A second, different type of example arises from the law's special concern for privacy rights.  (Sections 6250, 6254(c); Cal. Const., Art. I, sec. 1.)  Earlier versions of an electronic record that are present in metadata may include information the disclosure of which would violate a third party's privacy.  A wide range of types of information may be encompassed within the right of privacy, from residential phone numbers and Social Security numbers to sensitive medical, financial, and sexual data to information provided by, and the identity of, whistleblowers.  When a record is finalized and put forward for public view, it hopefully will have been sufficiently reviewed to be sanitized of private information.  But earlier versions of the record may not have been crafted with the same sensitivity to privacy that will have gone into the final document.

As a third example, metadata may include confidential communications between attorney and client that do not appear in the text of the record.  The law protects such communications from disclosure and imposes on attorneys a duty not to disclose such information.  (Cal. Evid. Code §954; Cal. Bus. and Prof. Code §6068(e).)  Thus, in response to a public records request of a city attorney's office, in many instances an attorney would be duty-bound to search the metadata in electronic records as to which he or she may have had input, even if the record on its face does not reveal an attorney-client communication.  And if a city department in custody of an electronic record that had been developed collaboratively with its attorney disclosed the record as a Word document without checking the metadata embedded in the record, it would run the risk of inadvertently disclosing confidential attorney-client communications.

These examples – and more could be cited – merely illustrate the point that metadata may contain information that is subject to redaction under the Act.  If a city were to give a requester a document in Word form, the city would be required to review the metadata embedded in the document – or avoid doing so at its peril, because failure to conduct this review would risk disclosure of privileged material.  Yet reviewing the metadata would be a laborious and problematic task – different in nature and magnitude from the process of reviewing the text to determine information that should be redacted and information that is reasonably segregable from that which should be redacted.

While the paper cautions that the author's views are his own and not those of the City Attorney's Office, that office's advice to city agencies has been to convert Word documents to PDFs before release to the public.
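
For readers wondering what this embedded information looks like in practice, here is a minimal sketch in Python. It assumes the newer .docx format, which is a zip archive whose basic properties (author, last editor, creation and modification dates, revision count) sit in a part named docProps/core.xml; older binary .doc files store this differently and are not covered. The sample names and dates are hypothetical, and the helper names read_core_properties and make_sample_docx are illustrative, not taken from any of the sources quoted here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly label -> element holding that property in docProps/core.xml.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
}


def read_core_properties(docx_file):
    """Return the basic embedded properties of a .docx file as a dict."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    props = {}
    for label, path in FIELDS.items():
        element = root.find(path, NS)
        if element is not None and element.text:
            props[label] = element.text
    return props


def make_sample_docx():
    """Build a tiny in-memory stand-in for a .docx (hypothetical values)."""
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<cp:coreProperties xmlns:cp="{NS["cp"]}" xmlns:dc="{NS["dc"]}" '
        f'xmlns:dcterms="{NS["dcterms"]}">'
        "<dc:creator>Outside Drafter</dc:creator>"
        "<cp:lastModifiedBy>Agency Staffer</cp:lastModifiedBy>"
        "<cp:revision>7</cp:revision>"
        "<dcterms:created>2009-10-01T09:00:00Z</dcterms:created>"
        "<dcterms:modified>2009-10-29T16:30:00Z</dcterms:modified>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for label, value in read_core_properties(make_sample_docx()).items():
        print(f"{label}: {value}")
```

To inspect a real file, pass its path to read_core_properties in place of the in-memory sample. Converting the same document to a PDF typically leaves these Word properties behind, which is why the format of release matters so much in this dispute.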

Meanwhile, as noted yesterday by Ansley Schrimpf for the Reporters Committee for Freedom of the Press,

The Arizona Supreme Court today ruled
that metadata – information about the history, tracking and management
of an electronic document – is subject to the state’s public records
law.

Several national media organizations supported Phoenix police
officer David Lake’s challenge that the city improperly denied his 2006
public records request for the metadata about documents he had
previously requested and received. The city refused Lake’s request,
arguing the metadata did not fall within the state’s definition of
public records, which a court established in 1952, long before the
advent of electronic documents.


In a unanimous opinion released today, the state’s high court held,
“If a public entity maintains a public record in an electronic format,
then the electronic version, including any embedded metadata, is
subject to disclosure.”


David J. Bodney, a lawyer who helped write a friend-of-the-court
brief on behalf of The Associated Press, Gannett Co., The E.W. Scripps
Company, and The Reporters Committee for Freedom of the Press, said the
state Supreme Court decision is a victory for public access.

“The decision is important because we live in an electronic age
where maintenance and preservation of public records in electronic
format is quickly becoming the norm,” Bodney said. “Public bodies
should not be permitted to withhold that information from public
inspection.”

The case – Lake v. City of Phoenix – began after Lake filed
an administrative complaint and federal lawsuit alleging employment
discrimination. He also filed a records request for a supervisor’s
notes, which he received, read and suspected had been backdated. So he
requested the documents’ metadata, which can include information about
a file’s creation, edit dates and authorship. After the city denied his
request, Lake sued. Two lower courts sided with the city, including a
division of the state’s appellate court, which ruled 2-to-1 against the
metadata’s release in January.

The court stated:

The metadata in an electronic document is part of the underlying document; it does not stand on its own.  When a public officer uses a computer to make a public record, the metadata forms part of the document as much as the words on the page. . . . Arizona’s public records law requires that the requestor be allowed to review a copy of the “real record.” . . . It would be illogical, and contrary to the policy of openness underlying the public records laws, to conclude that public entities can withhold information embedded in an electronic document, such as the date of creation, while they would be required to produce the same information if it were written manually on a paper public record.

We accordingly hold that when a public entity maintains a public record in an electronic format, the electronic version of the record, including any embedded metadata, is subject to disclosure under our public records law.  

Our decision is unlikely to result in the “administrative nightmare” that the City envisions.  A public entity is not required to spend “countless hours” identifying metadata; instead, it can satisfy a public records request merely by providing the requestor with a copy of the record in its native format.  Additionally, not every public records request will require disclosure of the native file. 

Public entities may provide paper copies if the nature of the request precludes any need for the electronic version.  Public records requests that are unduly burdensome or harassing can be addressed under existing law, which recognizes that disclosure may be refused based on concerns of privacy, confidentiality, or the best interests of the state.  . . . (balancing interests to determine if the state’s privacy or confidentiality concerns outweigh the presumption of disclosure).