Linguistics society debates data sharing

Jeff Good, LSA Data Sharing Resolution, Cyberling Blog, January 11, 2010.

At the recently concluded Annual Meeting of the Linguistic Society of America (LSA) in Baltimore, the following resolution on Data Sharing was passed by those at the Business Meeting. It will soon be sent along to the whole membership of the Society for their vote. The resolution was put forth by the LSA’s Technology Advisory Committee.

Whereas modern computing technology has the potential of advancing linguistic science by enabling linguists to work with datasets at a scale previously unimaginable; and

Whereas this will only be possible if such data are made available and standards ensuring interoperability are followed; and

Whereas data collected, curated, and annotated by linguists forms the empirical base of our field; …

Therefore, be it resolved at the annual business meeting on 8 January 2010 that the Linguistic Society of America encourages members and other working linguists to:

  • make the full data sets behind publications available, subject to all relevant ethical and legal concerns; …
  • work towards assigning academic credit for the creation and maintenance of linguistic databases and computational tools; and
  • when serving as reviewers, expect full data sets to be published (again subject to legal and ethical considerations) and expect claims to be tested against relevant publicly available datasets.

The resolution passed in the Business Meeting by a comfortable enough margin that no vote count was required. …

After the resolution was presented at the Business Meeting, the LSA Ethics Committee decided it would discuss the resolution on its Ethics Discussion Blog in the near future, specifically to address what ethical issues it raises.

U.S. libraries call for public access

American Library Association and Association of College and Research Libraries response to the Office of Science and Technology Policy consultation on public access, January 12, 2010.

… The ALA and ACRL have long believed that ensuring public access to the fruits of federally funded research is a logical, feasible, and widely beneficial goal. …

All federal agencies funding significant research should adopt public access policies. This is important in a wide variety of disciplines, as new research in many fields can have an immediate impact on the public good. It is also necessary to establish consistent expectations and conditions for the management of grants and resulting output, saving institutions and principal investigators valuable time.

Based on the initial experience of low manuscript deposit rates under a voluntary NIH Public Access Policy, mandatory policies are necessary to ensure compliance and routine uptake of such submissions.

We urge a short embargo period and recommend a 6-month maximum to bring U.S. policy into alignment with policies already in place in Canada, the United Kingdom, and the European Union. …

The authorized repository should provide support for converting the file to a standard mark-up language, such as the currently preferred XML, if the file is not submitted in that format. PDF, a document format in ubiquitous use, does not support robust searching, linking, text-mining, or reformatting over the long-term, nor does it provide full accessibility for the blind and reading impaired. …

Also see the press release.

How to pass a campus OA policy

Gavin Baker, Open access: Advice on working with faculty senates, College & Research Libraries News, January 2010.

Tim Hackman’s October 2009 Scholarly Communication column, “What’s the opposite of a pyrrhic victory?,” discussed the failure of the University of Maryland to adopt an open access policy. Responding to the advice in Hackman’s piece, this column offers some suggestions on the process of proposing a policy at your institution. …

My overall advice: consider your endeavor a political one. Yours won’t involve street demonstrations or smoke-filled backrooms (probably), but it certainly will involve making friends and changing minds. Politics is not only about logic and reasoning, but also emotion and relationships. Be prepared for it.

One theme echoed by Hackman and others who have proposed open access policies is to not overestimate faculty’s understanding of open access. To the contrary, expect to spend considerable time and effort informing faculty and responding to their questions and concerns. …

Message control is key to any political endeavor. Formulating clear, succinct messages, and sticking to them, ensures that your most effective and favorable arguments will be communicated.

I’ve seen myriad different arguments for open access, some of them extraneous, confusing, or even antithetical to faculty interests. Be ever mindful of your audience. Speak their language and tailor your message to their concerns. …

Small or private informational meetings, proceeding at a deliberate pace, help to avoid triggering alarms or making anyone feel they have been left behind. …

As you proceed, be aware of the fault lines and diversity within your institution. The proposal shouldn’t come toward a vote with anyone feeling, “People like me weren’t consulted.” …

At all stages, exhibit confidence in your proposal. Without being untruthful, always focus on the positive aspects; let critics do their own work. But always be willing to hear concerns, and be patient in addressing them. …

Finally, one principle of politics is: never take a vote unless you know you will win it. If possible, do a “whip count” in advance to ensure your proposal has sufficient support to pass. Lobby waverers until they’re prepared to vote for the proposal, and delay a vote until then. …

Episode 50 – The Crystal Ball Returns

For our golden anniversary podcast, regulars Tom, Mills, and Dan look into their crystal ball to see what the future holds for 2010 and the coming decade. We also look back at the biggest stories of 2009 and the prior decade. Via Twitter, we also share prognostications from our very smart audience.

Running time: 1:01:18
Download the .mp3

KWIC Modifications

I have been working on getting a cleaner output format from KWIC for the Greek texts on Perseus. Helma wanted the KWIC output to keep the word tags that occur in the Perseus texts, so that the word lookup function is usable directly from the KWIC results page. Since KWIC leaves as little formatting as possible, it strips out all tags, including the word tags, from the text. While I worked on that, I also added a few other modifications to KWIC to give a better look to the results page for the Greek texts.

The problem with KWIC for Greek texts is that Greek fonts do not support single-width (monospaced) display, which KWIC uses to align the results more cleanly. In addition, the title lines, which give the bibliographic information and link, can be different lengths, and this also causes problems aligning the search terms. See, for instance, this search.

To solve the word tag problem I just made a few modifications to the KwicFormat subroutine in philosubs. The main edit there was changing the line that stripped all tags into this:

$bf =~ s#<(?!(w |/w))[^>]*># #gi; # keep only word tags
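
To see what that substitution does, here is a minimal sketch of the same pattern in JavaScript. It is only an illustration: the real change is the Perl line in KwicFormat, the name keepWordTags and the sample markup are made up, and it assumes the pattern is meant to start matching directly at the opening angle bracket.

```javascript
// Illustrative sketch only; the production code is the Perl substitution above.
// keepWordTags and the sample string are hypothetical.
function keepWordTags(text) {
  // Replace every tag with a space unless it starts with "<w " or "</w".
  return text.replace(/<(?!(w |\/w))[^>]*>/gi, " ");
}

const sample = '<p><w id="1">logos</w> <b>kai</b></p>';
// The <w ...> and </w> tags survive; <p>, <b>, </b> and </p> become spaces.
console.log(keepWordTags(sample));
```

Note that the lookahead also lets through any closing tag whose name merely begins with "w", so it is a little more permissive than "word tags only".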

For the alignment issue, things were more complicated. Keeping track of the length of the left side of the line doesn't allow for a consistent place on the page due to the differing widths of the letters. In the end, I modified artfl_kwic to chop the left side of the hit to a size as close to a certain length as possible without breaking any words. Previously, both the right and the left were chopped to a certain length regardless of breaking words and length including tags, often resulting in very little content. Now, only the length of the display string is accounted for and in addition the length is adjusted for the length of the bibliographic title.
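
The left-side trimming can be sketched roughly as follows. This is a JavaScript illustration, not the Perl code in artfl_kwic: the function names, the tokenising pattern, and the idea of subtracting the title length from a fixed budget are all assumptions made for the example.

```javascript
// Hypothetical sketch; the actual logic is written in Perl inside artfl_kwic.

// Only the displayed characters count, so tags are ignored when measuring.
function displayLength(s) {
  return s.replace(/<[^>]*>/g, "").length;
}

// Keep as many whole words as fit, starting from the end of the left
// context (the end is the part that sits next to the search term).
function trimLeftContext(left, maxLength, titleLength) {
  // Assumption: a longer bibliographic title leaves less room for context.
  const budget = Math.max(0, maxLength - titleLength);
  // A token is a run of tags and non-space characters, so a tagged word
  // such as <w id="1">logos</w> is never split at the space inside its tag.
  const words = left.match(/(?:<[^>]*>|[^\s<])+/g) || [];
  const kept = [];
  let used = 0;
  for (let i = words.length - 1; i >= 0; i--) {
    // Each word after the first also costs one separating space.
    const cost = displayLength(words[i]) + (kept.length > 0 ? 1 : 0);
    if (used + cost > budget) break;
    kept.unshift(words[i]);
    used += cost;
  }
  return kept.join(" ");
}

console.log(trimLeftContext("en arche en ho logos kai ho", 12, 0)); // "logos kai ho"
```

Because whole words are dropped rather than cut, the result can be shorter than the budget, which is the price of never breaking a word.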

I also added a span around the left and right sides of the hit to allow for positioning and alignment using Javascript (and CSS). Then, with the following lines added to the Results Header, the search terms all line up neatly:
span.left { right:46%; position:absolute; }
span.right { left:54.5%; position:absolute;
height:18px; overflow:hidden;}
The numbers may look a little messy, but they give nice results. I found that without the decimal, the two sides were a bit too far apart, but there may be another way around that.

The extra bits for formatting the right span are in place of trimming the content of the right side in perl as I did for the left side. I found that the overflow:hidden property is quite handy if you can get it to work (it is a bit temperamental). As long as it is found in an absolutely positioned object with restricted size, AND it is contained within an element with overflow set to auto, it should work. It simply hides any content that does not fit within the given boundaries. It gives a very clean look to the right side of the page and even adjusts to different window sizes so that the content never leaks to the next line.

Take a look at this page and play with the window size to see what I mean. Unfortunately, there is no such nice property for trimming the overflow off of the left side instead of the right side. That is why I did it in perl instead of Javascript. There is a function called clip in javascript which is designed to clip an image, but again the way it works makes it much easier to trim from the right side than the left side. One could probably twist the clip function enough to make something similar happen for the left side (and make everything nicely adjustable and lovely), but for now, it is happening in perl. (I tried for a while, but my concoctions just seemed to slow things down and not add any exciting results.)

UPDATE: I couldn't resist playing a bit more with the javascript, and now it works like I wanted it to! Now, if you click on the link above, it won't illustrate what I said it would, because the javascript has been improved. I added this function:

function trimKwicLines(){
    var contentwidth = $(".content").width();
    $(".left").each(function (i) {
        var width = $(this).width();
        // place each left span so that its right edge ends just short of 40% of the content width
        var leftoffset = contentwidth*.4 - width*1 - 2;
        $(this).css("left", leftoffset);
    });
}

And changed things here and there in the CSS.

Melissa Terras’ Blog 2010-01-15 00:06:00

A belated Happy New Year, everyone. Like most of the UK, we've had our fair share of #uksnow, and I've used the time wisely, holed up in my shed finishing up the camera-ready copy for the DHQ volume on cyberinfrastructure and classics, in memoriam to Ross Scaife, which is going to be printed up by Gorgias Press. But that's not what I want to talk about here.

I made it into London yesterday in the constant snow: here is one of the Lions in front of the British Museum, looking a bit chilly. I had a meeting there as part of the Linksphere project; we hope to do some collaborative research with them on various things (early days, few details to share as yet, but exciting possibilities and a great meeting). The interesting thing I want to highlight was how this came about. We are only 10 mins walk up the road from the BM, but it's often hard to meet like-minded people in other organisations. So how did we get together? Because someone important at the BM saw some of @Clairey_ross's twitter posts, about the research we are doing. And lo, social media does lead to some new research possibilities.

Other things that are happening. I'm writing a book chapter about truth and representation in digital images, and particularly digitised images of text, and the implications that this has for manuscript-based scholarship. How did this come about? A silly game on Facebook, which looks for the longest word you can make from letters in your name. Mine is materialisms, apparently. And a friend joked that that was ironic, given that my research primarily exists in the digital, rather than material world. To which I said that I was editing a book on Digitizing Material Culture - and the aforementioned friend, who is an expert in classical art, particularly theories of representation, said "have you ever thought of applying *that* to *this*?" Which of course I hadn't. And now I shall. See, Facebook was useful for my work, after all.

It's these connections, and happenstances, which are perhaps the most useful? amusing? thing for me about social media. Sure, there is all the web 2.0 stuff - such as DHNow - but I'm enjoying the unexpected.

In the olden days (ie ten years ago), I used to enjoy looking at the books next to the books I needed in the Library. They invariably had the thing I was really looking for, or sent you off on another, random, unexpected trail. I'm glad that social media, for me, is starting to allow the same things to happen.

If you are not doing anything next week, and happen to be around Oxford, I'm helping chair an eScience Institute workshop on users, research, and web 2.0. Plenty of interesting speakers about interesting research in the area. There are still spaces, and it's free: do come along.

sticks and stones

Image source: The Guardian

Malaysia is in the news again for the usual reasons: nine churches firebombed by radical Muslims in the last week, on account of a rising controversy over the use of the word ‘Allah’ by Christians. A Catholic weekly paper, The Catholic Herald, was ordered by the government to cease publishing its Malay-language edition until the courts resolved the question of its use of the word ‘Allah’ to mean the God of the Christian faith. The question was resolved in High Court, which ruled in favour of the Herald. A week after the ruling, nine churches were torched over three days. Molotov cocktails were involved. The High Court ruling has been suspended pending appeal.

This semantic quibble can seem baffling to non-Malaysians, but the sad truth is that the event is wholly explicable within the context of Malaysian social dynamics. It seems to me that the trouble arises out of a potent (Molotov) cocktail of two factors: 1) the troubling relationship that exists between ‘Malay’ and ‘Muslim’ in Malaysia, and 2) the relationship that this hybrid ‘Malay-Muslim’ has with the rest of Malaysian society. First, some thoughts on the word Allah; then, on the Molotov cocktail of Malaysian society.

In the Beginning was the Word

First of all, it should be made clear that there are at least two words for ‘God’ in Malay (apart from, you know, the 99 names of God): Allah, and Tuhan. The first is from Arabic: a Semitic word for the divine, combining the definite article al- (the one) with the root word -ilah, meaning ‘god’. The root ilah can be compared with the Northwest Semitic el of Elohim, the Hebrew word for divinity.

The second and probably older word in the region, Tuhan, seems to share a common etymology with the Austronesian word atua, or te atua in Maori, meaning god (no particular one – the Maori were broadly animist). The link isn’t surprising. Malay is a member of the Malayo-Polynesian language tree, and many other linguistic commonalities run throughout the region: Indonesian, Micronesian, Polynesian and Philippine languages are all relatively closely related. Tuhan probably also has something in common with the Malay word tuan, which roughly means lord or master (as in, Joseph Conrad’s Tuan Jim, or Lord Jim).

Both words have been in use in Malay, more or less interchangeably, throughout its written history. Even on the famed Terengganu Inscription Stone (Batu Bersurat Terengganu), which is the earliest extant evidence of Islam on the Malay peninsula and dates to around 1303, the word ‘Allah’ appears three times, and the word ‘Tuhan’ twice.

Malay, Muslim, Same Diff

What has animated the whole controversy is the claim by the ruling government that the word ‘Allah’ is something especially Islamic, and by extension, exclusively Malay. The trouble comes at “by extension”. Under the Federal Constitution, a Malay is defined as a person who 1) is born to a Malaysian citizen, 2) professes to be Muslim, 3) speaks the Malay language, 4) adheres to Malay custom, and 5) is domiciled in Malaysia. Here’s the important part: Malay citizens who convert out of Islam are no longer considered constitutionally Malay, even if they were born to a Malaysian, speak Malay, adhere to Malay custom and live in Malaysia.

This bizarre, calcified mess of legal status, religion, language and ‘custom’ is almost entirely a product of colonial governance. The definition comes from a Land Reservation Act from 1913, which the British passed in an attempt to delineate which people should benefit from state land protectionism. But over time, the definition proved both politically expedient (the British gained a lot of colonial mileage out of professing to be looking after ‘the Malays’) and psychologically central to Malay self-perception (‘the Malays’ came to see themselves as a coherent cultural entity). The result is that today this definition is no longer only politically useful; it has become true for many Malays, and it is what their sense of identity rests on. And that identity is geographic, linguistic, cultural, and yes, religious.

Malay-Muslims vs. the World

The second factor is the relationship of Malay-Muslims to the rest of Malaysian society. One might ask: why is Islam such an important element of Malay identity, given the other four constitutional components?

The answer here, I feel, is demographic, and one can see that by comparing Malaysia with Indonesia. Indonesia has always been much more racially confident of itself than Malaysia, even though at times this confidence has led to terrible tragedies. Across Indonesia, visible ethnic minorities remain minorities, and are today often deeply assimilated. The Chinese population, to name the usual suspect, is small — only about 3-5% of the population. And Indonesian Christians, I’ll add, have no trouble using the word ‘Allah’:

Image source: Malaysia Today

In contrast, Malaysia is a much more heterogeneous society, with ‘Malays’ making up around 60% of the population, ethnic Chinese somewhere around 25 to 30% and ethnic Indians, mostly Tamils but also some Sikhs, around 8%. This has led to a certain amount of racial insecurity. Over the decades of the last century, various Malay groups have made various sorts of overtures to Indonesia in order to try to tap into the demographic advantages of Indonesia’s enormous, ethnically Malay population. The ejection of Singapore from Malaysia in 1965, it has been argued, had in part to do with fears over the large number of ethnic Chinese in Singapore that would become part of Malaysia, while the inclusion of British Borneo into the Federation in 1963 was arguably an attempt to beef up the ‘ethnic Malay’ numbers. In the 1940s, a group of Malay nationalists almost managed to negotiate Malaysia into sharing Indonesia’s independence when the Japanese Occupation ended: a thwarted vision of “a nation that would consolidate a hundred million brown peoples into a single Republic of Malaysia” (Taufik Abdullah, 1997:257).

But the fact remains that Malaysia has always been more interested in hooking up with Indonesia (and other regional Malays) than vice versa. Today, Malay Malaysians are on their own, in a very multicultural Malaysia. The proximity of multiculturalism, I think, has created a lot of incentive for Malays to differentiate themselves, and to hang on tightly to those differences. And in Malaysia, of the five constitutional elements of ‘Malayness’ I listed above from the 1913 definition, only two remain which are not now widely shared by all Malaysian citizens since Malaysian independence in 1957: Malay ‘custom’, and Islam.

Religion has therefore become a central marker of ethnic identity in Malaysia. And here is the nub of the problem. In the case of Islam, a religion that has historically spread with its carrier language, the Arabic language comes with the territory. It’s not so much that many Malays speak Arabic (in fact, I don’t know that many do), but rather that any connection to the Arabic culture and language should be, in Malaysia, only effected through Islam — which is in turn almost exclusively Malay. Two examples of this perceived special connection:

  1. In the forties and fifties vociferous battles were fought over whether the Malay language should be written in Jawi (Arabic script) as it had been since Islam came to the region, or in Rumi, the Latin script. Opponents of romanization accused pro-Rumi writers of being kaffirs (infidels), saying that discarding the Arabic script was tantamount to discarding Islam. The Arabic script was central to Malayness.
  2. Today, the word kitab, which means a normal ‘book’ in Arabic, is often used in Malay to refer to a specifically religious book, while secular books are simply buku, from the English.

Something similar is happening here with Allah. It may ‘just’ mean ‘God’ in Arabic, but in Malaysia, amidst these deep-rooted anxieties and questions of identity, it is so much more than a semantic quibble.

Allah, Tuhan, Same Diff

The frankly ridiculous claim, then, that only Malays/Muslims are allowed to use the word Allah, and that everyone else should back off and use Tuhan, certainly arises out of this ingrained defensiveness over what it is to be Malay. The claim that ‘Allah’ is somehow especially Islamic is disproved at least by the fact that the word itself predates Islam. Any argument that it has acquired ‘Islamness’ over time is furthermore disproved by the fact that it remains in use by Arab Christians today (and also by Indonesian Catholics). The dogged adherence to this claim by a small number of firebomb-wielding ultra-Malays is only explicable when we understand how sensitively most Malays are invested in themselves (politically or otherwise) as Muslims, in distinction to the other ethnic groups and religions in Malaysia. To be flippant: if Malays were really interested in being more ‘Malay’, they should in fact use the word Tuhan, which is much more ‘Malay’ for having deeper regional Austronesian roots, than Allah, which is, after all, an imported name for an imported God.

IMLS Native American/Native Hawaiian Museum Services CFDA 45.308

The Native American/Native Hawaiian Museum Services program promotes enhanced learning and innovation within museums and museum-related organizations, such as cultural centers. The program provides opportunities for Native American tribes and Native Hawaiian organizations to sustain heritage, culture, and knowledge through strengthened museum services in the following areas:

  • Programming: Services and activities that support the educational mission of museums and museum-related organizations.
  • Professional development: Education or training that builds skills, knowledge, or other professional capacity for persons who provide or manage museum service activities. Individuals may be paid or volunteers.
  • Enhancement of museum services: Support for activities that enable and improve museum services.

Applications are due by April 1, 2010.

Eligible applicants are Indian tribes or organizations that primarily serve and represent Native Hawaiians. For the purpose of funding under this program, “Indian tribe” means any tribe, band, nation, or other organized group or community, including any Alaska native village, regional corporation, or village corporation (as defined in or established pursuant to the Alaska Native Claims Settlement Act (43 U.S.C. Section 1601 et seq.)) which is recognized by the Secretary of the Interior as eligible for special programs and services provided by the United States to Indians because of their status as Indians.
A list of eligible entities is available from the Bureau of Indian Affairs, except for the recognized Alaska native villages, regional corporations, and village corporations (Alaskan entities should refer to applicable provisions in the Alaska Native Claims Settlement Act, referenced above). The same population cannot be served by more than one grant. For the purposes of funding under this program, “organizations that primarily serve and represent Native Hawaiians” means any nonprofit organization that primarily serves and represents Native Hawaiians, as the term is defined in 20 U.S.C. Section 7517; such organizations are also eligible for funding. The term “Native Hawaiian” means (a) any individual who is a citizen of the United States, and (b) a descendant of the aboriginal people who, prior to 1778, occupied and exercised sovereignty in the area that now comprises the state of Hawaii, as evidenced by genealogical records; Kapuna (elders) or Kamaaina (long term community residents) verification; or certified birth records. IMLS recognizes the potential for valuable contributions to the overall goals of the Native American/Native Hawaiian Museum Services program by entities that do not meet the eligibility requirements above. Although such entities may not serve as the official applicants, they are encouraged to participate in projects as partners. Federally operated libraries and museums may not apply for the Native American/Native Hawaiian Museum Services grants, but they may serve as nonessential partners to applicants if they do not receive IMLS grant funds as a result of the project. Contact IMLS before submitting a proposal involving a federal agency or federal collection. Consult with IMLS about any eligibility questions before submitting an application.

Document Type: Grants Notice
Funding Opportunity Number: NANH-FY10
Opportunity Category: Discretionary
Posted Date: Jan 11, 2010
Creation Date: Jan 11, 2010
Original Closing Date for Applications: Apr 01, 2010
Current Closing Date for Applications: Apr 01, 2010
Archive Date: May 01, 2010
Funding Instrument Type: Grant

Category of Funding Activity: Arts (see "Cultural Affairs" in CFDA)
Humanities (see "Cultural Affairs" in CFDA)

Category Explanation:
Expected Number of Awards: 25
Estimated Total Program Funding:
Award Ceiling: $50,000
Award Floor: $5,000
CFDA Number(s): 45.308 -- Native American/Native Hawaiian Museum Services Program
Cost Sharing or Matching Requirement: No

NEA Challenge America Fast-Track, FY 2011 CFDA 45.024

The Challenge America Fast-Track category offers support primarily to small and mid-sized organizations for projects that extend the reach of the arts to underserved populations -- those whose opportunities to experience the arts are limited by geography, ethnicity, economics, or disability. Applications are due by May 27, 2010.

Age alone (e.g., youth, seniors) does not qualify a group as underserved; at least one of the underserved characteristics noted here also must be present. This category, as an essential component of the Arts Endowment's goal of providing wide access to artistic excellence, supports local projects that can have significant effects within communities. Grants are available for professional arts programming and for projects that emphasize the potential of the arts in community development.

Partnerships can be valuable to the success of these projects. While not required, applicants are encouraged to consider partnerships among organizations, both in and outside of the arts, as appropriate to their project.

These Fast-Track grants:

  • Extend the reach of the arts to underserved populations.
  • Are limited to the specific types of projects outlined below.
  • Are for $10,000 each.
  • Receive an expedited application review. Organizations are notified whether they have been recommended for a grant approximately six months after they apply; projects may start shortly thereafter.

NOTE: A policy will be implemented in the coming year to limit consecutive year funding. This policy will ensure that Challenge America Fast-Track funding reaches new organizations and their communities of underserved populations with limited access to the arts. Starting with grants that are awarded in FY 2011 (that result from applications received under this year’s May 27, 2010 deadline), an organization that receives Challenge America Fast-Track grants for three years in a row will not be eligible to apply to the Fast-Track category for the following one-year period. For example, if an organization receives grants in FY 2009, 2010, and 2011, it may not apply again in FY 2012. During FY 2012, the organization may apply to other Arts Endowment funding opportunities including Access to Artistic Excellence and Learning in the Arts for Children and Youth. The organization would be able to apply to the Challenge America Fast-Track category in FY 2013.

An Organization may submit only one application through one of the following FY 2011 Grants for Arts Projects categories: Access to Artistic Excellence, Challenge America Fast-Track, Learning in the Arts for Children and Youth. The Arts Endowment's support of a project may start on or after January 1, 2011.

Document Type: Grants Notice
Funding Opportunity Number: 2010NEA01CAFT
Opportunity Category: Discretionary
Posted Date: Jan 12, 2010
Creation Date: Jan 12, 2010
Original Closing Date for Applications: May 27, 2010 (May 27, 2010: Application Deadline; January 1, 2011: Earliest Beginning Date for Arts Endowment Period of Support)
Current Closing Date for Applications: May 27, 2010 (May 27, 2010: Application Deadline; January 1, 2011: Earliest Beginning Date for Arts Endowment Period of Support)
Archive Date: Jun 26, 2010
Funding Instrument Type: Grant

Category of Funding Activity: Arts (see "Cultural Affairs" in CFDA)

Category Explanation:
Expected Number of Awards:
Estimated Total Program Funding:
Award Ceiling: $10,000
Award Floor: $10,000
CFDA Number(s): 45.024 -- Promotion of the Arts_Grants to Organizations and Individuals
Cost Sharing or Matching Requirement: Yes

Eligible Applicants:
State governments
County governments
City or township governments
Special district governments
Independent school districts
Public and State controlled institutions of higher education
Native American tribal governments (Federally recognized)
Nonprofits having a 501(c)(3) status with the IRS, other than institutions of higher education
Private institutions of higher education

Additional Information on Eligibility:
APPLICANT ELIGIBILITY

Nonprofit, tax-exempt 501(c)(3), U.S. organizations; units of state or local government; or federally recognized tribal communities or tribes may apply. Applicants may be arts organizations, local arts agencies, arts service organizations, local education agencies (school districts), and other organizations that can help advance the goals of the Arts Endowment.

To be eligible, the applicant organization must:

  • Meet the Arts Endowment's "Legal Requirements" including nonprofit, tax-exempt status at the time of application. (All organizations must apply directly on their own behalf. Applications through a fiscal agent are not allowed.)
  • Have a three-year history of programming prior to the application deadline.
  • Have submitted acceptable Final Report packages by the due date(s) for all Arts Endowment grant(s) previously received.

NEA Web Site Complete Announcement