Cassandra Example Code is Driving Me Crazy

February 1st, 2012 No comments

I’m updating my machine learning project, and I’m trying to add Cassandra as a datastore… but it’s difficult, because Cassandra is not my goal here, and Cassandra wants me to have it as a goal.

“Updating” means “rewriting from scratch,” incidentally. The good news is that it’s actually working better now than it did.

What do I mean by having Cassandra as a goal? Well… think about the project checkpoints so far: train on a corpus (a “body of text”), classify corpora properly (verified with unit and module tests), and be able to persist training data.

Nowhere in these checkpoints is Cassandra a requirement.

It’s not a bad dependency, of course; Cassandra’s pretty neat, which is why I chose it in the first place, and it’s a skill set I’d like to improve.

“Improve” means “create from whole cloth.” I’m a Cassandra user like a bicyclist is a space shuttle pilot.

Therein lies my problem: the learning curve for Cassandra is destroying my will to live, so to speak.

I guess what I want is a set of examples from whole cloth. I can put together pieces: starting Cassandra (for tests and for “production”); I can programmatically create keyspaces; I can even programmatically create simple column families, and update rows.

Queries are a problem, because it’s taking me a while to really grasp what slices are. I’m getting there.
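
Here is the mental model of a slice that seems to fit, as a minimal sketch in plain Java rather than Cassandra client code. It assumes a row can be pictured as a sorted map from column name to value, and that a slice is the run of columns between a start name and a finish name. The class SliceSketch, its methods, and the sample column names are all made up for illustration; nothing here talks to a real cluster.

```java
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: plain Java collections, not the Cassandra client API.
class SliceSketch {

    // One "row": column names kept in sorted order, each mapped to a value.
    static NavigableMap<String, String> row() {
        NavigableMap<String, String> columns = new TreeMap<>();
        columns.put("2012-01-30", "trained on corpus A");
        columns.put("2012-01-31", "trained on corpus B");
        columns.put("2012-02-01", "classified corpus C");
        columns.put("2012-02-02", "persisted training data");
        return columns;
    }

    // In this toy model, a slice is every column whose name lies between
    // start and finish, with both ends included.
    static SortedMap<String, String> slice(NavigableMap<String, String> row,
                                           String start, String finish) {
        return row.subMap(start, true, finish, true);
    }

    public static void main(String[] args) {
        System.out.println(slice(row(), "2012-01-31", "2012-02-01"));
    }
}
```

Running main prints only the columns named 2012-01-31 and 2012-02-01; the other two fall outside the requested range. The point of the picture is that a slice is a range over sorted column names, not a search over values.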

But what I really wish is that someone would take a simple set of examples, and show them from start to finish, clearly explaining what’s going on – with full code.

The example code I see probably works, you see, but it’s not self-contained – all the examples have a ton of references that aren’t part of the actual example code.

Please, people – give us examples we can understand.

Totally cool new unexpected site – not mine, BTW

January 20th, 2012 No comments

I just ran across weinersmith, Zach Weiner’s wife’s site, and I’m completely and totally engrossed.

Zombie ants. Who knew?

Categories: general

New story: The Messiah

December 24th, 2011 No comments

Since it’s Christmas, I just published a new story, “The Messiah,” based loosely on the story in Luke of Jesus’ birth. I’ll warn you ahead of time: It’s not a happy story.

Categories: general

Eunice Prevatt, 1921-2011

December 9th, 2011 No comments

My grandmother passed away a few days ago. She was one of the best people I’ve ever known.

This is something I wrote for her shortly after finding out she’d passed away.

A gentle heart and a kind word
Faith in God and hands to hold
Support for any well-meant deed
This is what she meant to me.

Inspiring and encouraging
Upholding and uplifting
Guiding, driving, serving, free
This is what she was to me.

She’d seen so much and kept alive
A light on a hill most high
Shown to all, God’s grace to see
This is what she meant to me.

I cannot help but hold her there
In mind, in heart, in prayer still
She held more grace than I can see
This is what she was to me.

Rest in peace in Heaven’s kind embrace, Granny.

Categories: general

Lab of the Dead

November 15th, 2011 No comments

I’ve wasted a few hours finishing an odd game called “Lab of the Dead,” a game in which you play a scientist doing experiments on zombies in an attempt to find a cure for their condition. It’s an oddly amusing game, for various reasons, and if you’d like to play it, do so before reading too much – I’m going to mention the ending.

The game has subject matter warnings, mostly because you’re basically assaulting corpses. Some language, but nothing worse than you’d see on most television.

Basically, the story is this: Zombie Apocalypse! You’re a scientist in an underground bunker, with an endless supply of test subjects (i.e., former humans) with a largely self-assigned goal of determining a cure. To do so, you have various weapons, food supplies (for the zombies), and objects, with which you poke and prod and tempt the zombies to ascertain and develop their reactions.

As noted in “I Love New Words:” See those zombies over there? You should probably get away from them.

Along the way, you can research new items, as well as the subjects themselves (“this one likes candy,” along with some others). The items include animals (dead and alive), weapons (chainsaws and crossbows, for example), and objects (guitars, stuffed bears, etc.).

Weapons are used to attack the zombie (with some attacks resulting in termination of the test subject). Many of the weapons yield permanent damage (which can affect how the zombie responds to other objects).

The weapons were amusing in a slightly horrific way. The game’s maturity level is affected by the constant (and often necessary) mutilation of the zombies; I found this rather disturbing, but the game makes clear that the zombies are dead; it’s supposed to be working with dead tissue that happens to respond, at that point.

And therein lies the amusement. The game’s basic premise is that you are basically applying behavioral psychology to zombies, in an attempt to find a cure for a weaponized virus.

It does not end well for anyone; at one point, lacking support from the military sponsors who are guarding you, you actually kill off everyone with you (which gives you more test subjects, naturally, as well as some other less savory things to work with). Even after you’ve accomplished the game’s goals, you… basically give up, doing whatever you want until you decide to stop playing.

It’s pretty obvious from early on that this is the case. You’re the one scientist, using prior research that clearly lays out the viral nature of the zombies’ genesis. Your cure would rely on genetic engineering as well, or at least something to block the function of the virus that causes zombification.

Yet you’re using behavioral psychology! You’re saying, “given stimulus X, with this precondition, I will get result Y,” throughout the whole game. You can “teach” the zombies slightly, in that the results establish a pattern for the test subject (i.e., give it a doll enough and it plays with it appropriately), but at no point do the zombies indicate anything more than a very passing concept of past, future, or anything.

They’re basically reptilian brains: eat, eat, eat. Only when satiated do they really change behaviors. Regression is trivial; train it to play a drum, for example, then beat it with sticks, and its behavior goes back to raw aggression with no memory of knowing how to play a drum until you repeat the “training” by treating it nicely.

That said, it’s an amusing game. Not a fun one, as such, in that now that I’ve finished it I’m not likely to ever play it again, but I was curious as to whether the authors were going to decide Behaviorism was a way to “cure” zombies – and no, it wasn’t.

Tips on Writing Articles

October 31st, 2011 1 comment

As editor of TheServerSide.com in the past, I had a lot of opportunity to see what worked in online writing and what didn’t. Since it was a core part of my job to be efficient online, I also did a lot of research into various techniques, and while I certainly can’t claim to be an SEO expert of any kind, I can describe some things that can help online writing be more efficient.

This was originally written for TheServerSide.com, and was linked on the post submissions page; since then, they’ve mangled the formatting of old posts so it’s nearly unreadable, although the .Net version of the article is still formatted properly.

Naturally, since it was written back in 2007 or so – probably earlier, honestly – I realized that it, too, could have used improvement. As a result, I copy/pasted the original, edited it, and added a whole section on the process I use to write. I’ve also removed the “TheServerSide.com” references, as these are general suggestions and I don’t work for TSS any more.

Tips on Writing

Writing articles for the web is a lot like writing anything else, with the main differences being that you have fewer limitations, and thus even more chances to lose your audience.

When you write online, it’s being written for posterity. Thus, write it as well as you can. You want to be very clear, and you want to make sure you’re not assuming things on the part of your audience.

Don’t say things in passing. Be specific. Your audience doesn’t have the time or interest to try to figure out what you meant. Trying to be fancy will only make you less effective as a writer. There’s no issue with having a personality or a style in your writing, but if the dominant feature in your writing is your style (and not the content itself) then you’re going to lose your audience.

Losing your audience is a bad thing.

The first sentence in your article is by far the most important. It needs to communicate why a given article should be read. If that one sentence is not effective, your article will start off with a smaller audience, because you will lose readers immediately.

Getting your first sentences right is by far the hardest thing to do correctly in writing online, in my experience.

The first sentence in each paragraph is also important. On the web, readers tend to scan more than read. If you put the topic sentence after the first sentence in the paragraph, chances are good that the readers will not reach it. They will have moved on to the next paragraph, or perhaps they will have stopped reading altogether.

It is all right if the subject of a paragraph cannot be completed in one sentence, but the topic sentence should be enough to communicate what the paragraph means. If the reader wants more explanation, the rest of the paragraph exists to provide it.

Short sentences are good. Long sentences are fun to write, and they are often quite natural for authors, but they are not efficient for web readers to scan.

Bwahahaha*gag*. As usual, I tend to pick on myself in my writing. I’m guilty of violating almost every tip I have – which is part of why I know what works and what doesn’t – and I have a bad tendency to make a statement and then violate it to illustrate the point. Your mileage may vary as to whether it works.

Don’t use emphasis techniques like bold or italics any more than you absolutely need. If you feel that bold is necessary to make your point, then it’s likely that your sentence or paragraph is organized poorly. Highlights like bold or italics draw the eye long after the bold text is read, and the highlights actually lower comprehension.

Code samples are great, and usually required, but make them clear and complete. References to third party sources are all right, but the readers are best served by full code. Boilerplate code, like accessors and mutators, can be ignored with a comment, but it could be more effective if you used properties directly for simplicity … or, instead of including a long list of accessors and mutators, use Lombok. If you’re using Java, that is.

Remember that writing online is still writing. The writing process rarely lends itself to single drafts. It can happen, but it’s rare, and usually not effective.

For efficiency and good writing style, follow a set of simple tips:

  • Make an outline.
  • If the first sentence of a given paragraph isn’t enough to understand what the paragraph is about, rewrite the paragraph.
  • Make sure the article matches your premise! If the subject of the article is “Object Databases and Efficiency,” don’t spend half your text discussing the failures of relational databases. Talk about object databases and efficiency instead.
  • Make sure your spelling is correct.
  • If you use a word processor that provides grammar checking, you should allow it to suggest changes. Check your reading level, Flesch-Kincaid scores, and other data you can. The average reader wants to read at a sixth-grade level. If you consistently score much higher, your article will be hard to read. On the web, ‘hard to read’ usually means ‘unread.’ (This document, as originally published, received a Flesch-Kincaid grade level score of 7.8, for example. I haven’t checked the revised version, because I’m afraid to.)
  • Have people read your draft, and listen to every suggestion they offer. Don’t offer it only to experts; offer it to willing newbies, too. It’s all right to decide not to use suggestions, but your wider audience is going to think of many of the same things your test audience tells you. Constructive criticism is good, especially if it’s received before the article is published.
  • Avoid parentheticals like the plague (the plague is bad, no matter which plague it is.) If you have to use parentheticals, go ahead, but try hard to not overuse them. (They’re hard to read, and break the flow of text. Plus, they’re annoying.)
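
The Flesch-Kincaid grade level mentioned in the list above is a simple formula: 0.39 times the average number of words per sentence, plus 11.8 times the average number of syllables per word, minus 15.59. Here is a minimal sketch of it in Java. The class and method names are made up for illustration, and the syllable counter is a crude assumption (it counts runs of vowels); real grammar checkers count syllables far more carefully, so expect their scores to differ.

```java
// Rough readability sketch; the syllable estimate is deliberately naive.
class ReadingLevel {

    // Flesch-Kincaid grade level from raw counts.
    static double gradeLevel(int words, int sentences, int syllables) {
        return 0.39 * ((double) words / sentences)
             + 11.8 * ((double) syllables / words)
             - 15.59;
    }

    // Assumption: one syllable per run of consecutive vowels, minimum one.
    static int estimateSyllables(String word) {
        int count = 0;
        boolean inVowelRun = false;
        for (char c : word.toLowerCase().toCharArray()) {
            boolean vowel = "aeiouy".indexOf(c) >= 0;
            if (vowel && !inVowelRun) {
                count++;
            }
            inVowelRun = vowel;
        }
        return Math.max(count, 1);
    }

    // Splits on whitespace for words and on ., ! or ? for sentences.
    static double gradeLevel(String text) {
        String trimmed = text.trim();
        String[] sentences = trimmed.split("[.!?]+\\s*");
        String[] words = trimmed.split("\\s+");
        int syllables = 0;
        for (String word : words) {
            syllables += estimateSyllables(word);
        }
        return gradeLevel(words.length, sentences.length, syllables);
    }

    public static void main(String[] args) {
        System.out.println(gradeLevel(
            "Short sentences are good. Long sentences are fun to write."));
    }
}
```

The formula shows why the earlier tips matter: longer sentences and longer words both push the score up.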

What I Do When I Write

I actually mind-map most of the things I write, with Freeplane, an open-source (and free) mind-mapping tool. I then draft a rough map for the article.

Why Freeplane over Freemind? Excellent question. I just found Freeplane to be slightly more user-friendly. Thus: it’s personal preference. The main point is the suggestion to use mind-mapping, not to use specific software.

My central node is, naturally, the subject. (This article would have “tips on writing.”) Then I create child nodes for the central points that I think I need to make – the things without which I’d decide the article wasn’t worth reading.

This becomes my structure. If I have a child node that isn’t something I have to have, then it’s extraneous to the document, or my subject thesis isn’t correct.

These first vertices are the most important part of writing an article, to me; the rest I can usually figure out as I go, if I have to, because my important points (the second-level nodes of the map) should be clear enough and relevant enough to use as a guide for everything.

I usually fill out another few levels for each of the second level nodes, too, though, because they also should have supporting thoughts associated with them. (Otherwise they’re not supported; they’d better stand on their own, then.)

Then, I draft the article itself. The space each supporting statement should get should be roughly analogous to the size of the corresponding node in the graph; if a node in the graph is very short, yet the text for that node is very long, then perhaps my graph isn’t very complete… or perhaps the section is getting too much emphasis, which is by far the more common case. (Or, alternatively, I’m showing code, which drastically affects the size of a block of text. No way around this, sadly, other than hiding the code or providing a link to an external page, neither of which is effective.)

If it’s stated simply in the mind map, there’s no reason it shouldn’t be stated simply in the final product. (Corollary: state things simply in the mind map.)

Then I reread the article, a lot. I find willing victims as often as I can, and make them read it; usually I get unuseful and generic responses like “it’s good,” which boosts my ego some but, realistically, those responses don’t do much for me or for the article. I’m looking for constructive criticism, questions that come from the reader, things like that.

Remember: it’s a draft. As long as you keep that in mind, barbs thrown at you from readers won’t sting much. If they do, well, sorry – listen to your readers. Maybe you won’t factor in what they say (remember the list of tips earlier in this article?) but there’s usually a reason they think what they think.

Sometimes a point made by a reader emphasizes what you wanted to have happen – maybe you wanted the reader to wonder something, you know? If the comments are in line with what you desired to happen, well, that’s a win. If they’re not, well, that’s why you draft and that’s why you rewrite.

The concept is that a mapped article, if it’s able to be graphed properly, will naturally have a better, more cohesive structure for your readers, improving the signal-to-noise ratio and guiding the author’s efforts. A draft process, detached from your ego, is designed to make sure that your writing isn’t purely an exercise in ego, which makes it more broadly appealing and useful.

Conclusion

I’ve spent a long time learning how to write well online, with varying results. I, too, am ego-driven, after all. However, I love to read well-written stuff, and as an editor for online and dead tree material, I’ve learned what works and what doesn’t. Hopefully you will find these tips and processes useful; I’d love to hear about alternative processes, too, because there’s more than one way to dig a hole.

Categories: Art

I miss E-mail.

October 11th, 2011 No comments

I miss asynchronous conversation.

I miss the ability to have an actual thread of thought preserved in something less ephemeral than memory, or in some chat log somewhere on one of my systems’ hard drives.

I miss the ability to not be there if someone has an observation I’m interested in. I don’t want to have to observe in real time.

I miss email. If someone has something to say, is it that hard to write it in such a way that it can be understood clearly, with topics and explanations?

I say no.

Ironically, I say this on a blog, whose sole medium is the constructed and preserved thought… which means it’s going to be missed by the very audience from whom I’d prefer asynchronous communication.

I use Twitter, but not Facebook; my use of Twitter isn’t “normal,” I think, and it’s fairly inefficient.

I can make 140-character thoughtlines, I think, but they lack a core representation of my personality in them. While I recognize that the point is the message and not the messenger, often the messenger creates the message not as a set of words, but with the force of personality and intent.

The message is the thing. The messenger makes the message, and becomes part of it.

Twitter’s limitations on messages force a very tight focus, which is a good thing – it’s an excellent training ground for learning how to focus what you say – but tight focus lacks conviction.

I miss the chance to see that conviction.

There’s social commentary here, too, even if I don’t know how to frame it well. Recently, I had an email exchange with someone, and he complained that I had taken too much time to explain my position on something, that I clearly wasn’t focused on my responsibilities if I had time to explain myself in detail.

I was horrified and amused – the dismissiveness was funny, really, but the intent behind it was not so good.

I still don’t know if what he meant was that my reasons were specious, or that he had no interest in reasonings. (It’s my personal feeling that convictions establish the meaning behind what people think; I can accept the silliest concepts from people who have reasons to hold them, even if I don’t agree with them.)

I miss email.

A lot.

Thanks for allowing Backup to mangle my drive, Windows!

October 4th, 2011 No comments

With my main working laptop, I’ve had a real issue with the DVD lately – put in any disk with data, and it’s almost impossible to use. Not just the disk, Windows 7 itself.

The disk spins up, down, tries to read, and basically gets stuck in this cycle until I eject the disk in frustration and try something else, which usually means copying the disk contents to a network drive, and using the data from there.

As you can imagine, this is maddeningly slow and frustrating.

Today, I figured out the problem: Windows Backup. I’d started a backup cycle, and never finished it (because I used a different backup procedure) – and Windows Backup was apparently checking every disk as I put it in, and trying to figure out if it could use it as a backup medium.

Note that I am assuming that it’s analyzing the media – I don’t know for sure. All I know is that I couldn’t use any disk unless I was very lucky and in the mood to let lots of time pass.

So today I finally went in and worked out the Windows Backup process, eventually cancelling it entirely – and behold! I can use my drive again.

Much joy ensues.

Categories: Programming

Protected: But of course…

September 26th, 2011 Enter your password to view comments.

Categories: Art, Music, Psychology

Testdata-generator project

September 26th, 2011 No comments

I ran across a mention of Testdata Generation on Twitter this morning, and was intrigued.

How did it work? More importantly, how would I use it? How well does it work for what I’d use it for?

Basically, what it does for my purposes is provide a generator for two types of classes: data and services. (It actually generates more than just these, but these are what I would need.)

So I threw together a quick Datum object, defined like this:

package testproject;

import lombok.Data;

@Data
public class Datum {
  private String firstName;
  private String secondName;
}

I’m using Lombok to remove all the boilerplate Java code. Basically, Datum gets getters and setters for its fields, a toString(), and equals() and hashCode(), all generated from the @Data annotation.
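
For anyone who hasn’t used Lombok, here is roughly what that one annotation saves you from typing. This is a hand-written approximation, not Lombok’s actual generated code (which differs in details such as how it computes the hash). The name PlainDatum is made up so it doesn’t clash with the real Datum class.

```java
import java.util.Objects;

// Hand-written approximation of what @Data provides for two String fields.
class PlainDatum {
    private String firstName;
    private String secondName;

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public String getSecondName() { return secondName; }
    public void setSecondName(String secondName) { this.secondName = secondName; }

    // Two instances are equal when both fields match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlainDatum)) return false;
        PlainDatum other = (PlainDatum) o;
        return Objects.equals(firstName, other.firstName)
            && Objects.equals(secondName, other.secondName);
    }

    // Must agree with equals(): equal objects produce equal hashes.
    @Override
    public int hashCode() {
        return Objects.hash(firstName, secondName);
    }

    // Same ClassName(field=value, ...) shape as Lombok's toString().
    @Override
    public String toString() {
        return "PlainDatum(firstName=" + firstName
            + ", secondName=" + secondName + ")";
    }
}
```

That is about thirty lines of code for two fields, which is why the annotation is worth the extra dependency in sample code.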

My “test” simply had the following code:

package testproject;

import ch.nerdin.generators.testdata.TestData;
import org.testng.annotations.Test;

public class DataTest {
  @Test
  public void testData() {
    Datum datum = TestData.createBeanInstance(Datum.class);
    System.out.println(datum);
    datum = TestData.createBeanInstance(Datum.class);
    System.out.println(datum);
  }
}

This actually cranks up a Spring context (I’m not sure what the scope is) and populates two instances of Datum, so my System.out output looked like this:

Datum(firstName=sspepbtujm, secondName=agqrvaoswe)
Datum(firstName=zndttryqyü, secondName=üasabkdkuo)

Subsequent runs gave me different data; it’s not consistent yet. In fairness, though, it has a dbUnit generator as well, so you could use that to create consistent data in a dataset. (I’ve done this sort of thing with XStream, see “Fun with XStream” from way back in February 2011.)
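
If you want repeatable values without a dataset, one general technique is to drive the generation from a seeded random number generator. The sketch below is a hypothetical stand-in written for this post, not how testdata-framework works internally and not part of its API: SeededBeans, create, randomWord, and the nested Name class are all invented names. It assumes the bean has a no-argument constructor and only fills String fields.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// Hypothetical illustration of seeded bean population via reflection.
class SeededBeans {

    // Creates an instance and fills each non-static String field with random letters.
    static <T> T create(Class<T> type, Random random) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            Field[] fields = type.getDeclaredFields();
            // Sort by name so the fill order does not depend on the JVM.
            Arrays.sort(fields, Comparator.comparing(Field::getName));
            for (Field field : fields) {
                boolean isStatic = Modifier.isStatic(field.getModifiers());
                if (field.getType() == String.class && !isStatic) {
                    field.setAccessible(true);
                    field.set(bean, randomWord(random, 10));
                }
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not populate " + type, e);
        }
    }

    // Builds a word of the given length from lowercase ASCII letters.
    static String randomWord(Random random, int length) {
        StringBuilder word = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            word.append((char) ('a' + random.nextInt(26)));
        }
        return word.toString();
    }

    // Small bean used only by this sketch.
    static class Name {
        String firstName;
        String secondName;

        @Override
        public String toString() {
            return "Name(firstName=" + firstName + ", secondName=" + secondName + ")";
        }
    }

    public static void main(String[] args) {
        // The same seed twice gives the same values twice.
        System.out.println(create(Name.class, new Random(42L)));
        System.out.println(create(Name.class, new Random(42L)));
    }
}
```

The two lines printed by main are identical because both calls start from the same seed. Share one Random across calls instead and you get varied data that is still reproducible from run to run.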

Looks like a very cool project.

Incidentally, it imports Spring 2.5.6, via spring-context and spring-aop. This brings in spring-core and spring-beans.

I do everything with Spring 3.0.5, so my Maven pom.xml was fairly long due to having to exclude the Spring 2.5.x dependencies. Using the 3.0.5 version of spring-context worked just as well for the dependencies, and without a hitch.

Here’s my project file, just in case you’re interested:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <groupId>testproject</groupId>
    <artifactId>testproject</artifactId>
    <version>1.0</version>
    <dependencies>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>0.10.0</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>ch.nerdin</groupId>
            <artifactId>testdata-framework</artifactId>
            <version>0.10</version>
            <exclusions>
                <exclusion>
                    <groupId>org.springframework</groupId>
                    <artifactId>spring-aop</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.springframework</groupId>
                    <artifactId>spring-context</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-context</artifactId>
            <version>3.0.5.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.testng</groupId>
            <artifactId>testng</artifactId>
            <version>6.1.1</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>1.6</source>
                    <target>1.6</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>