Data, Evidence, and Ideas: The Philosophy of Social Science

04-13-2019 , 02:47 PM
I doubt I can provide a completely convincing counter to the claim that social science isn't really science, but I want to talk about it a bit anyway. Or, more accurately, I want to quote Howard Becker talking about it.

But, before that, I want to note that I think sometimes the topic gets muddled because it's not very clear to people what the boundaries of "social science" are, especially in relation to the Humanities. This confusion probably owes in large part to the "social sciences" and "humanities" influencing each other, whether in the form of postmodernist or feminist philosophy informing debates about positivism within social sciences, or the proliferation of various "cultural studies" (gender studies, African American studies, etc.) programs in the humanities.

I would not argue that literally every publication or work dealing with society and culture is scientific, although I would also not argue that all of it needs to be in order to be worthwhile. Instead, my more modest goal is to try to outline some general thoughts on how at least some work in the social sciences can be said to be scientific, with much the same commitments to scientific epistemology as other sciences, although the social sciences also face some unique challenges.

With that in mind, the title of this thread draws on the 2017 book Evidence, by the sociologist Howard Becker, whom I recommend very highly. This book is in large part a critique of flaws common to sociological research. But it also provides a useful philosophical outline of the social sciences qua science, and I hope thinking about the relationship between these three concepts of Data, Evidence, and Ideas will be useful. Here's how Becker introduces the subject:

Quote:
In the early 1960s, Paul Wallin and Leslie C. Waldo, two Stanford sociologists, wanted to learn how social class affected children's school performance (a question that still concerns social scientists). They administered a questionnaire to 2,002 eighth-grade boys and girls. To measure social class, they asked the children to answer a question from August Hollingshead's then well known and often used Index of Social Position:
[Describe] your real father, if you live with him. If you are not living with your real father, answer... about the man you live with who is supposed to be taking his place...

Most of the time does he work for himself or for somebody else?
____ he works for himself or has his own business
____ he works for somebody else
____ I don't know what he does
What is his work or job most of the time?
He _________________________. (Wallin and Waldo 1964, 291)
[Wallin and Waldo] probably thought a father's occupation would serve as a substantial clue to (if not a definitive measurement of) social class, a combination of the economic and social realities of the parents' way of life and the lives their children might have. They thought this report of the work the father did, this single fact, would give them an indirect way to guess at the family's income and wealth, and thus an inexact, perhaps not explicitly formulated but nevertheless not meaningless, measure of the parents' hopes for their children's education. (pp. 1-4)
Becker goes on to detail the difficulties Wallin and Waldo ran into when they tried to use the data gathered via this questionnaire, largely tied to an inability to classify nearly 25% of respondents. The example is more broadly useful, though, because it concisely demonstrates many of the methodological challenges of social science: selecting methods that operationalize complex concepts like "social class" so they can be measured, drawing valid inferences from those measurements, and marshaling those inferences in support of theories.
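To make the operationalization problem concrete, here is a minimal sketch in Python of coding free-text occupation answers into class categories. The coding rules and answers below are invented for illustration (they are not Wallin and Waldo's actual scheme), but the sketch shows how the unclassifiable remainder that sank their analysis arises:

Code:
from collections import Counter

# Invented coding rules loosely in the spirit of the Hollingshead item;
# these are NOT Wallin and Waldo's actual categories.
OCCUPATION_CODES = {
    "owns a grocery store": "self-employed / proprietor",
    "factory worker": "employed, manual",
    "teacher": "employed, professional",
}

def code_response(answer):
    """Map a free-text answer to a class category, if the scheme has a slot for it."""
    return OCCUPATION_CODES.get(answer.strip().lower(), "UNCLASSIFIABLE")

responses = [
    "owns a grocery store",
    "factory worker",
    "he works hard",   # vague: the scheme has no slot for this
    "I don't know",    # missing information: also unclassifiable
]

coded = Counter(code_response(r) for r in responses)
print(coded)
print("unclassifiable share:", coded["UNCLASSIFIABLE"] / len(responses))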

Quote:
Data, Evidence, and Ideas

The things social scientists observe, however they observe them, and then record more permanently in writing, visual images, or audio recordings -- the material they work with -- consist of observable physical objects: marks produced by machines; marks produced by people who check a box on a questionnaire or write something a sociologist or historian might read or use; marks social scientists make when they write down what they've seen or heard; marks produced by people who record their own behavior as part of the work they do (as police officers record the names of people they arrest and the offense they charge them with); marks produced by employees or volunteers who collaborate with social scientists to record what the people they want to learn about tell them or do in their presence. These recorded traces serve as data, the raw material researchers make science from.

These data... become evidence when scientists use them to support an argument: good evidence when the audience accepts the items as valid statements about what happened when someone gathered the original data. We base a statement about a person's age on the proof provided by a recorded answer to a question someone asks them, on paper or in person, or on information someone copies from an official record... -- all these kinds of data usually attest well enough to the reliability and truth of the answer that people accept the argument we offer it as support for. "Yes, she really is 22 years old"; her birth certificate proves it as well as any reasonable person could want it proved. And that makes it evidence, data supporting a statement that goes beyond what can be seen on the paper to a reality, an accepted fact.

The data-turned-into-evidence support a statement about a particular example of some general idea we want other people to believe or at least accept for now. For scientists, the idea usually belongs to a more general system of ideas or concepts that we call a "theory".

Data, evidence, and ideas make a circle of interdependencies. The data interest us because they help us make an argument about something in the world that they would be consequential for. Expecting that others may not accept our argument, we collect information we expect will convince them that no one could have recorded reality in that form if our argument wasn't correct. And the idea we want to advance leads us to search for kinds of data, things we can observe and record, that will do the work of convincing others for us. The usefulness of each of the three components depends on how it connects to the other two.
Broadly, then, my argument is that "social science" is in fact scientific to the extent that researchers employ methods to ensure the reasonable validity and reliability of the data they collect, in relation to the ideas that data is meant to serve as evidence for. So, in the example of Wallin and Waldo, the problem they ran into (and it's a common one) was a disconnect between their data and the idea of social class: problems with the data made it unsuitable as evidence of social class. Most of social science methodology is about overcoming similar problems by standardizing approaches known to work well for producing data, e.g. in the use of surveys, structured interviews, content analysis, and so on.
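As a concrete example of that kind of standardization: in content analysis it is common to check the reliability of coded data by having two people code the same material independently and computing a chance-corrected agreement statistic such as Cohen's kappa. Here is a minimal sketch; the labels and codings are invented:

Code:
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement between two coders, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Two coders independently classify the same ten interview excerpts.
a = ["middle", "working", "middle", "upper", "working",
     "middle", "working", "middle", "upper", "working"]
b = ["middle", "working", "working", "upper", "working",
     "middle", "middle", "middle", "upper", "working"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # 0.69: substantial but imperfect agreement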

When Becker writes that we wish to make an argument about our data that "no one could have recorded reality in that form if our argument wasn't correct", he hits on probably the most important objection, again evident in the example of social class: it is too complex a concept to admit of any simple measure. It's not like trying to measure the mass of an object. Social science data are rarely so conclusive as to admit of only a single explanation. But, I would argue that even if social sciences must sometimes embrace more modest goals than some physical sciences, they are nonetheless science. Here Becker cites the work of the mathematician George Polya:

Quote:
Polya on Plausibility as an Appropriate Goal for Empirical Science

Strictly speaking, all our knowledge outside mathematics and demonstrative logic consists of conjectures. There are, of course, conjectures and conjectures. There are highly respectable and reliable conjectures as those expressed in certain general laws of physical science. There are other conjectures, neither reliable nor respectable, some of which make you angry when you read them in a newspaper. In between, there are all sorts of conjectures, hunches, and guesses.

We secure our mathematical knowledge by demonstrative reasoning, but we support our conjectures by plausible reasoning. A mathematical proof is demonstrative reasoning, but the inductive evidence of the physicist, the circumstantial evidence of the lawyer, the documentary evidence of the historian, and the statistical evidence of the economist belong to plausible reasoning.

The difference between the two kinds of reasoning is great and manifold. Demonstrative reasoning is safe, beyond controversy, and final. Plausible reasoning is hazardous, controversial, and provisional. Demonstrative reasoning penetrates the sciences just as far as mathematics does, but it is in itself incapable of yielding essentially new knowledge about the world around us. Anything new we learn about the world involves plausible reasoning.
Essentially, then, the point here is that there is no difference in kind between so-called "hard" and "soft" sciences; rather, given the complexity of the social world, social scientists need to respect the difficulty involved in making the "plausible conjectures" of social scientific arguments reliable and respectable. Previously I compared measuring social class to measuring mass, suggesting that the former was far more complex. But Becker spends some time in this book pointing out that we have gone through great difficulties and large expenditures of time and money to do things like measure the masses of fundamental particles. One explanation for the relatively lower success of social science research in comparison to some physical sciences may simply be that we aren't trying hard enough.

In any case, Becker says:

Quote:
I expect social science reports to consist of statements supported by reasonable arguments, and data that suggest plausible, believable conclusions. But I also expect, as a working scientist, that most of what we think is true will someday turn out to be not so true, to be subject to all sorts of variations our present formulations and data can't explain. I expect them to explain some part of the puzzle but leave plenty of work still to be done.
There are other interesting topics related to social science research methodologies that I'd love to talk about: debates between quantitative and qualitative approaches, "inductive" vs. "deductive" research plans (hypothesis testing vs. the generation of new hypotheses from raw data), specific arguments in support of the validity of standard methods (e.g. surveys), and so on. But hopefully this is a useful broad apology for social science as science.
04-15-2019 , 06:10 PM
Models of Inquiry: Linnaeus and Buffon

In his 2002 book The Politics of Large Numbers: A History of Statistical Reasoning, Alain Desrosières highlights two historical scientists associated with differing models of scientific inquiry: Carl Linnaeus and Georges-Louis Leclerc, Comte de Buffon. I have not read this book, but I'm borrowing from Becker's summary of it in Evidence because I think it points towards a useful way of thinking about arguments between scientists who favor mostly quantitative methods and those who favor more qualitative methods. Plus the Comte de Buffon has a totally sweet name.

Quote:
Linnaeus proposed the use of a fully made classificatory scheme into which scientists could insert the information their research produced. Scientists completed their work when they filled all the slots in the classification scheme with data. Buffon proposed, on the contrary, to make the construction of the classificatory scheme itself the main job to be done, a job that would never end because, he thought, new and unexpected data would continually overflow the then-existing classificatory boxes, requiring rearrangements of ideas into new, until then unexpected, patterns and arguments. Both thinkers investigated animals and plants, but each used the information his research produced in different ways. (Becker, p. 24)
Stated this way, the Buffonian model may seem obviously superior; where do the Linnaean categories come from? But there are many cases where a priori classifications seem less arbitrary, e.g. if I begin some research by formulating a hypothesis about differences between men and women, and then go on to classify data according to gender, most of the time that will seem plausible. There are many similarly "common sense" ways of grouping characteristics that are employed in social sciences: age, education level, geographic location, ethnicity, and so on. Desrosières (and Becker) associate this model of inquiry with hypothesis testing, and it's the model most familiar from introductory classes on the scientific method. As Becker puts it:

Quote:
In the purest, most classical form, a dissertation proposal reviews a collection of literature supposed to report on a coherent body of already gathered knowledge that has reached a point where the problem the student proposes to solve represents the next step on the road to an ever-growing system of established, law-like propositions....

[S]tudents detail what everyone else (the "literature") has said about it, then, most importantly, how their research will gather data whose analysis will resolve some existing disagreement and make it possible to adjudicate between rival explanations. (p. 26)
In other words, hypotheses emerging from prior research are like the Linnaean classification scheme, and the work is to see how new data fit into the scheme. This model of inquiry fits easily with ideas about falsification and hypothesis testing, and is often used in quantitative research, where the goal is to evaluate large, hopefully representative, datasets and analyze along relatively narrow lines to test various hypotheses. A great deal of the value of this model is in the ability to generalize conclusions from statistical significance over meaningful sets of data.
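As a toy illustration of this workflow: the categories (here, gender) are fixed before the analysis begins, and the data are used to adjudicate a hypothesis stated in terms of those categories. The counts below are invented, and I'm assuming scipy is available for the test itself:

Code:
from scipy.stats import chi2_contingency

# Invented contingency table: rows are a predefined (Linnaean) grouping,
# here men vs. women; columns are yes/no answers to some survey item.
table = [[40, 60],
         [55, 45]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
# A small p-value supports rejecting "no association between gender and
# the answer" -- new data slotted into a scheme fixed in advance.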

But, this model also highlights in an interesting way the problem of the "theory-ladenness" of observation, e.g. working in this model one will tend to pay attention mostly to the characteristics of phenomena which seem relevant to the hypotheses in question, while backgrounding other characteristics which may also be potentially interesting. In the social sciences -- where the phenomena to be investigated are so rich and varied, and causal analysis so difficult because of the sheer number of causal factors -- this is an important limitation.

The Buffonian model, by contrast, is similar to many qualitative methods, where researchers construct classification schemes in an iterative way by analyzing richer data without necessarily trying to reduce each case to a set of numeric dimensions (although qualitative researchers very often count things). The goal is to develop new insights into patterns in data, which may challenge existing classification schemes. The value of this model is that it may be somewhat less vulnerable to "theory-laden observation", and it can allow some deeper insights into the complexity of causal processes underlying some social phenomena one wants to understand. The drawback is that this model of inquiry is generally time consuming and thus often performed using smaller datasets: interviews, content analysis of texts, participant observation, and so on. It is less easy to assess statistical significance or the representativeness of conclusions. It is also probably less easy to ensure methodological rigor, or at least the considerations may be more subtle than the problems of rigor which appear in statistical analyses, though they are no less important. In sociology, a widely used "Buffonian" methodology is called grounded theory, and it's the approach taken in the content analysis research I've participated in.
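For contrast, here is a very rough sketch of a Buffonian loop: the classification scheme is built while reading the data, and a new category is opened whenever a case doesn't fit the scheme so far. Real grounded theory coding is interpretive and human-driven, so the keyword matching below is only a crude stand-in for the analyst's judgment, and all the excerpts and labels are invented:

Code:
# Toy open coding: categories emerge from the data rather than preceding it.
excerpts = [
    "I skipped meals to pay the rent",
    "my landlord raised the rent again",
    "the clinic turned me away without insurance",
    "I could not afford the prescription",
]

codebook = {}  # category label -> supporting excerpts

def assign(excerpt):
    """Return an existing category that fits, or open a new one."""
    for label in codebook:
        if label in excerpt:
            return label
    return excerpt.split()[-1]  # crude seed label for a brand-new category

for e in excerpts:
    codebook.setdefault(assign(e), []).append(e)

for label, items in codebook.items():
    print(label, "->", items)
# The scheme ends up with categories ("rent", "insurance", ...) that were
# not specified before the analysis began.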

I hope that the way I've framed the strengths and weaknesses of the two approaches suggests that they ought to be complementary, each helping to address the weaknesses of the other. I would argue that a lot of suspicion of social science is rooted in suspicion of the "Buffonian" model of inquiry, and hopefully viewing it within the framework of "data, evidence, and ideas" makes it seem less suspicious, especially when it's understood as having value in relation to statistical methods.

Here's Becker's take:

Quote:
The differences between the two models lie in how each relates to the data-evidence-theory circle. The "quantitative" model most often has trouble with the connection between the data and the evidence, with showing that the data really measure what the investigator says they should measure to be useful as evidence in the later argument. The "qualitative" model has trouble at the other end, with showing that the collected evidence, though based on acceptable data that are what they claim to be in relation to observed fact, is clearly related to the idea the investigator insists it embodies or demonstrates or is relevant to. Each approach has the advantages it claims, if you allow its premises, but each likewise has characteristic faults it prefers not to deal with unless it has to. (p. 35)
