Tangents and Pointers

Friday, April 30, 2010

"The package 'ubuntu-desktop' is marked for removal but it is in the removal blacklist."

While upgrading from Ubuntu 9.10 to 10.04, I ran into the error "The package 'ubuntu-desktop' is marked for removal but it is in the removal blacklist."

Turns out this was because I had both ubuntu-desktop and xubuntu-desktop.

Solution: Removed xubuntu-desktop with

sudo apt-get remove xubuntu-desktop

Upgrade continued without any problem after that.

Source: https://bugs.launchpad.net/ubuntu/+source/update-manager/+bug/571743

Friday, March 19, 2010

svn: Expected FS format between '1' and '3'; found format '4'

Ok, in brief. Needed to create a local subversion repository. Used
"svnadmin create /home/nikhil/.svndata"
to create one locally. Tried to link this repo from Eclipse by entering
"file:///home/nikhil/.svndata"
as the repo location. Failed with the error
"svn: Expected FS format between '1' and '3'; found format '4'"

The reason is that my Ubuntu Subversion client was version 1.6*, while the Eclipse plugin was only 1.5* compliant. The real fix is to either downgrade the Ubuntu client or upgrade the Eclipse client. However, a quick and dirty fix worked. I made the file
"/home/nikhil/.svndata/db/format"
writable (it was originally read-only) and changed the first line in the file from 4 to 3. Eclipse has been able to read and write with the older client since then. Your Mileage May Vary.
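For anyone who would rather script the same hack, here is a minimal Python sketch of the edit described above. The function name and its parameters are my own invention for illustration, and it assumes the format number sits alone on the first line of the file. It carries the same caveats as the manual edit, so back up the repository first.

```python
import os
import stat


def set_fs_format(format_path, expected="4", replacement="3"):
    """Rewrite the first line of a repository 'db/format' file.

    Returns True if the file was changed, False if the first line did
    not match `expected` (in which case the file is left untouched).
    """
    with open(format_path) as handle:
        lines = handle.read().splitlines()

    if not lines or lines[0].strip() != expected:
        return False

    # The file starts out read-only, so add owner write permission first.
    mode = os.stat(format_path).st_mode
    os.chmod(format_path, mode | stat.S_IWUSR)

    # Only the first line changes; any remaining lines are kept as they were.
    lines[0] = replacement
    with open(format_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return True
```

In my case the call would have been set_fs_format("/home/nikhil/.svndata/db/format").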

UPDATE: A month later, no problems faced so far.

Friday, February 13, 2009

The security validation for this page is invalid. Click Back in your Web browser, refresh the page, and try your operation again.

I was working with SharePoint web services when I ran into this very cryptic error message. It was especially puzzling because I was accessing the service through Java client code (using JiBX) and not through a browser. Even more interestingly, I had successfully invoked the Query operation from the Search service and the GetItem operation from the Copy service. The message started showing up in the XML response to the CopyIntoItems operation call. On top of that, I had no issues accessing the operation through SoapUI. Here are the request and response XML:

Request:
<?xml version="1.0"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
<Body>
<tns:CopyIntoItems xmlns:tns="http://schemas.microsoft.com/sharepoint/soap/">
<tns:SourceUrl>http://host.name.com/PATH/Template.txt</tns:SourceUrl>
<tns:DestinationUrls>
<tns:string>http://host.name.com/PATH/File.txt</tns:string>
</tns:DestinationUrls>
<tns:Fields>
<tns:FieldInformation Type="Text" DisplayName="mykeyword" Value="test value.." />
</tns:Fields>
<tns:Stream>b3VyY2UgdGV4dCBkYXRhIGZyb20gc2V2ZXJhbCBjb2RlIHBhZ2VzIGFuZCBlbmNvZGUgdGg=</tns:Stream>
</tns:CopyIntoItems>
</Body>
</Envelope>

Response:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<CopyIntoItemsResponse xmlns="http://schemas.microsoft.com/sharepoint/soap/">
<CopyIntoItemsResult>0</CopyIntoItemsResult>
<Results>
<CopyResult ErrorCode="Unknown"
ErrorMessage="The security validation for this page is invalid. Click Back in your Web browser, refresh the page, and try your operation again."
DestinationUrl="http://host.name.com/PATH/File.txt" />
</Results>
</CopyIntoItemsResponse>
</soap:Body>
</soap:Envelope>

So I was finally reduced to comparing the HTTP packets sent over the wire by my code and by SoapUI. My initial guess was that this could be due to the way cookies were handled or something else with the NTLM authentication. But the real culprit turned out to be the header
SOAPAction: "http://schemas.microsoft.com/sharepoint/soap/CopyIntoItems". Since I had to use a custom web service client, my code did not set this header field. Setting it solved the issue. There are details on the SOAPAction header here. What I do not understand is why SharePoint takes the SOAPAction header into consideration for only this operation and not the others.
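My client was Java, so the following is not the code I used. It is a small Python sketch of the same idea, showing where the header goes when you build the HTTP request by hand. The function name and the endpoint URL are placeholders, NTLM authentication is left out entirely, and the request is only built, not sent.

```python
import urllib.request

# The action URI for the CopyIntoItems operation, as seen in the SoapUI traffic.
SOAP_ACTION = "http://schemas.microsoft.com/sharepoint/soap/CopyIntoItems"


def build_copy_into_items_request(endpoint_url, envelope_xml):
    """Build (but do not send) a SOAP POST that carries the SOAPAction header.

    endpoint_url is a placeholder for wherever the Copy service lives.
    envelope_xml is the full SOAP envelope as a string.
    """
    return urllib.request.Request(
        endpoint_url,
        data=envelope_xml.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # The header value is the action URI wrapped in double quotes.
            "SOAPAction": '"%s"' % SOAP_ACTION,
        },
        method="POST",
    )
```

Note that urllib normalizes header names internally, so the header is stored under the key "Soapaction". HTTP header names are case-insensitive, so that makes no difference on the wire.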

Thursday, October 9, 2008

Bayesian Rating - how to implement a weighted rating system

I had saved this link in my Saved Links on my favorite social bookmarking site, reddit. When I wanted it for reference, both the link and the site were gone. Luckily, Google cache came to the rescue. So for future reference I am saving it here as is:

March 30th, 2006 in Basic Tutorials · By Markus Weichselbaum

Many web sites allow users to provide feedback on products, services or other users. In addition to verbal reviews, rating facilities are typically present that allow visitors to rate an item from 0 to 5 (often in conjunction with stars), from 0 to 10, or simply by voting + or -, respectively.

These visitor ratings are then often used to rank the rated items. And when “rank” comes into play, it gets tricky.

Ranking using Bayesian average

Hopefully the headline hasn’t turned you away yet – it smells of mathematical hardcore. But fear not, once you know how, implementing a robust rating and ranking system using the approach discussed here is really quite simple, very elegant, and most importantly, it works really well!

A basic example using simple + and - votes

In fact, the artworks in TheBroth gallery are visitor rated, using a rather simple + and - system. If you like an item, rate it “plus”. If you don’t like it, give it a “minus”.

The rating of an item would then be: number of positive votes divided by number of total votes. For example, 4 + votes and 1 - vote would correspond to a rating of 0.8, or 80%.

Now if you want to rank the items based on this simple equation, the following happens:

Assume you have one item with a rating of 0.93, based on hundreds of votes. Now another, new item is rated by a total of 2 visitors (or even just one), and they rate it +. Boom, it goes straight to the #1 position in the ranking, as its rating is 100%!
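To make the problem concrete, here is a small Python sketch of the naive rating and the ranking it produces. The item names and vote counts are made up for illustration; 279 plus votes and 21 minus votes give the 0.93 rating from the example.

```python
def naive_rating(plus_votes, minus_votes):
    """Share of positive votes, or None if the item has no votes yet."""
    total = plus_votes + minus_votes
    if total == 0:
        return None
    return plus_votes / total


# Hypothetical items as (name, plus votes, minus votes).
items = [
    ("established", 279, 21),  # 279 / 300 = 0.93
    ("newcomer", 2, 0),        # 2 / 2 = 1.0
]

# Sort by the naive rating, highest first.
ranked = sorted(items, key=lambda item: naive_rating(item[1], item[2]), reverse=True)

# The item with only two votes ends up first.
print([name for name, _, _ in ranked])
```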

A weighty issue

What we want is this:

If there are only a few votes, then these votes should count less than when there are many votes and we can trust that this is how the public feels about it. In other texts this value is also referred to as “certainty” or “believability”.

This means, the more votes an item has, the higher the “weight” of these votes.

Thus, we want to calculate a corrected rating that somehow takes the weight of votes into account:

The more votes an item has, the closer the corrected rating value would be to the uncorrected rating value.
The fewer votes an item has – and this is the main trick here – the closer its rating should be to the average rating value of all items!

That way, new votes pull the corrected rating value away from the average rating, towards the uncorrected rating value.

There you have it – this is the main algorithm of what we call “Bayesian rating”, or rather “Bayesian ranking” as it is really about the relation of the item ratings to each other, based on the number of votes of each item.

Using a magic value

We now need to apply a “magic” value that determines how strong the weighting (or dampening, as some consider it) shall be. In other words, how many votes are required until the uncorrected value approximates the corrected value?

It really depends on how many votes the items get on average. There is no point requiring 1000 votes for the item to rank 60% when each item only gets a handful of votes on average.

Thus, we could make this “magic” value exactly that, namely the average number of votes for all rated items, and voila, our Bayesian rating system is complete. By making the magic value dynamic, it will auto adapt to your system.

Finetuning the magic value

You could opt to create an upper limit to your magic value so that your system doesn’t come to a grinding halt when there are many votes per item. An ever-growing magic value would make it less and less possible to actually influence the rating of a new item, because it takes so many votes before you believe the rating of a new item.

The finetuning will depend on whether your system has a large influx of new items or not. If there are many new items added all the time, this influx will keep the average number of votes per item low. If your system has a fixed number of items, such as “rate your favorite star of The Beatles”, you may not need an upper limit. If you do add the occasional item, then an upper limit makes sense to give new items a chance to rate highly more quickly.
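As a rough Python sketch of the idea, the magic value could be computed as the average vote count of all rated items, optionally capped. The function name and the upper_limit parameter are assumptions made for this illustration, not part of any particular system.

```python
def magic_value(vote_counts, upper_limit=None):
    """Average number of votes over items that have at least one vote.

    If upper_limit is given, the result is capped at that value so that
    new items can still move away from the average rating fairly quickly.
    """
    rated = [count for count in vote_counts if count > 0]
    if not rated:
        return 0.0
    average = sum(rated) / len(rated)
    if upper_limit is not None:
        return min(average, upper_limit)
    return average
```

With vote counts of 1000 and 3000 and an upper limit of 50, the function returns 50 instead of the raw average of 2000.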

Bayesian rating for everyone

Now, let’s summarize all this and provide a working formula for you to use in your code:

Bayesian rating uses the Bayesian average, which calculates the rating of an item based on the “believability” of its votes. The greater the certainty based on the number of votes, the more the Bayesian rating approximates the plain, unweighted rating. When there are very few votes, the Bayesian rating of an item will be closer to the average rating of all items.

Use this equation:

br = ( (avg_num_votes * avg_rating) + (this_num_votes * this_rating) ) / (avg_num_votes + this_num_votes)

Legend:

avg_num_votes: The average number of votes of all items that have num_votes>0
avg_rating: The average rating across all items (again, only those that have num_votes>0)
this_num_votes: number of votes for this item
this_rating: the rating of this item

Note: avg_num_votes is used as the “magic” weight in this formula. The higher this value, the more votes it takes to influence the bayesian rating value.
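The equation translates almost directly into code. The following Python sketch uses the variable names from the legend; the rank_items helper and the sample data are assumptions added for illustration.

```python
def bayesian_rating(avg_num_votes, avg_rating, this_num_votes, this_rating):
    """Corrected rating 'br', exactly as in the equation above."""
    return ((avg_num_votes * avg_rating) + (this_num_votes * this_rating)) / (
        avg_num_votes + this_num_votes
    )


def rank_items(items):
    """Rank items by Bayesian rating, highest first.

    items maps a name to (num_votes, rating). Items without votes are
    skipped, matching the num_votes>0 condition in the legend.
    """
    rated = {name: pair for name, pair in items.items() if pair[0] > 0}
    if not rated:
        return []
    avg_num_votes = sum(votes for votes, _ in rated.values()) / len(rated)
    avg_rating = sum(rating for _, rating in rated.values()) / len(rated)
    scored = [
        (name, bayesian_rating(avg_num_votes, avg_rating, votes, rating))
        for name, (votes, rating) in rated.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: name -> (number of votes, plain rating).
sample = {
    "established": (300, 0.93),
    "newcomer": (2, 1.0),
    "middling": (100, 0.6),
}

for name, br in rank_items(sample):
    print(name, round(br, 3))
```

With this sample data the two-vote newcomer no longer takes first place. Its corrected rating sits just above the average rating of all items, so it ranks below the established item.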

How Bayesian Rating is used in TheBroth

We use it to show the “highest rated artworks” in order. We wanted to prevent a new artwork with 1 vote from immediately jumping to first place, as its rating would be 100%. Using Bayesian rating, its starting rating with one positive vote would be a little bit higher than the average rating of all items.


Thursday, May 29, 2008

Google App Engine with Eclipse and PyDev on Ubuntu

Google App Engine just opened up a slew of possibilities for me. I had been looking for free app hosting for a while, just to try out different languages and frameworks. I was not lucky enough to squeeze in through the door when App Engine launched, but I did get access when Google accepted more developers during Google I/O.

I fired up Eclipse and was about to head to PyDev when, purely out of habit, I googled to check if anyone had blogged the steps. And indeed, I found this:
Google App Engine & eclipse (PyDev)

Since I am on Ubuntu, the only difference worth mentioning was the step to specify the Python interpreter.
In step 5, instead of
C:\Python25\python.exe
I selected
/usr/bin/python2.5

Now all that is left is to let the creative juices flow!