tomo's blog

Improved Google Spam Filter?

Submitted by tomo on February 26, 2011 - 10:08pm

Google, in response to the flood of recent concern about spam and content farms showing up in its results, has just announced a big change to the algorithms that calculate page rankings. It had previously published a Chrome plugin that lets you manually block results, and Google says the new algorithm blocks some 84% of the same sites that people were blocking with the plugin. As for the rest, I'd rather guess that some people were controversially blocking non-spammy sites than that Google's algorithm isn't good enough. Or isn't it?

Matt Cutts, the main anti-spam guy at Google, says the new algorithm change affects 11.8% of queries. Since the change is only effective in the US right now and I can browse from both Vietnam and the US, I can compare results, and roughly one in eight queries should be improved.

So I tested "dog shampoo" out of the blue. I have never had a dog because I think they smell.

In Vietnam, high-ranking results included drnaturalvet.com, which had a low-quality page of filler about dog shampoo, and dogshampoo.info, which is clearly a made-for-AdSense site. In the US, the drnaturalvet link is much lower, but dogshampoo.info maintains the same high position. A link to content farm ehow.com is also lower now. And a link to dogshampoo.co.uk, a made-for-AdSense site with nothing about dog shampoo at the time of indexing (see cache), is now gone too.

A search for winrar came up with fairly similar results in either country, and both maintained links to spam sites like software.informer.com.

A search for "tightvnc server authentication successful closed connection" punished duplicate content site pinoytech.org slightly but another duplicate/copy site efreedom.com maintained its position in the top 20. Both copy the StackExchange site SuperUser.com.

So it seems that the new algorithm change is an improvement, but I don't think it goes far enough to filter spammy results. While it may be a slight setback for those guys, they are still in the running and will be emboldened to try to rank higher.

There may still be a need for users to crowdsource a database of filtered spam sites until further algorithm improvements.

Note: The Atlantic did a similar test from India on "is botox safe" and "drywall dust" and found their results to be much improved.

After months of disparity between the official Vietnam dong/US dollar exchange rate and the rate commanded by the black market, the State Bank of Vietnam has weakened the local currency to a range surrounding 20693. This is the fourth decrease in 15 months, the sixth in two years. The move by the central bank was expected, as there had been great pressure to devalue in the months leading up to Tet, along with a promise from the government not to do so before Tet.

This won't be the last devaluation for 2011.

For one, the new rate is still below the 21000+ the black market had been demanding before the change. Previously, the dollar would fetch up to 21300 VND at jewelry shops.

Secondly, the pressure on the dong due to rising inflation and the trade deficit remains -- and foreign currency reserves are still low, limiting how much the central bank can artificially support the currency. Inflation remains in the double-digit range after a lull from the global economic crisis and crash in oil prices. Gasoline prices will either continue to increase, contributing to imported inflation, or the government will continue to deplete its reserves to subsidize gas. The trade deficit, still high although there are signs of it lowering, is also dependent on now officially higher import prices, as most manufactured exports rely heavily on imported materials. However, high prices will also discourage imports. High inflation will force rises in wages, making manufacturing in Vietnam less competitive and adding to the pressure to weaken the dong to make exports cheaper, especially while global economic growth is stalling.

Economic consultant Bui Kien Thanh predicts a CPI increase of 0.15% for every 1% drop in the value of the dong, or a 1.4% increase in the CPI due to just the latest dong drop.
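The arithmetic behind that estimate can be checked with a small sketch. It only uses the two figures quoted above (0.15 and 1.4) and assumes the relationship is linear; the function names are made up for illustration. Inverting the rule shows what size of devaluation the 1.4% figure implies.

```javascript
// Rule of thumb quoted above: CPI rises 0.15 percentage points
// for every 1% drop in the value of the dong (assumed to be linear).
const CPI_POINTS_PER_PERCENT_DROP = 0.15;

// Estimated CPI increase (percentage points) for a devaluation given in percent.
function cpiIncrease(devaluationPercent) {
  return CPI_POINTS_PER_PERCENT_DROP * devaluationPercent;
}

// Devaluation (percent) implied by a CPI increase, obtained by inverting the rule.
function impliedDevaluation(cpiIncreasePercent) {
  return cpiIncreasePercent / CPI_POINTS_PER_PERCENT_DROP;
}

// The quoted 1.4% CPI increase implies a devaluation of roughly 9.3%.
console.log(impliedDevaluation(1.4).toFixed(1)); // "9.3"
```

In other words, the 1.4% figure is consistent with a drop of a little over 9% in the latest devaluation.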

Third, Vietnamese people still lack confidence in their local currency and no plan has been announced to address any of these issues or for delaying further devaluations for any time. After a period of dollar scarcity it's now possible again to buy dollars from banks. Concerned about rising inflation and expecting continued high inflation, Vietnamese will continue to rely on assets they view as inflation hedges such as gold, real estate, and the dollar. These large one-off devaluations do nothing to increase people's confidence in the dong and the focus appears to be on rapid economic growth rather than macroeconomic stability. The central bank has stated they will adjust the official rate more frequently and flexibly, which should increase confidence in the local currency.

As a fourth factor, 12-month non-deliverable forwards for the currency had been in the 21000-22000 range for the previous 12 months but have shot up to the 23700s as of February 11th, when the new rate was announced. One interpretation is that the dong will remain steady for the next 12 months, but as this latest devaluation wasn't baked into the earlier forward prices, another interpretation is that the dong will drop another 11% in value by this time next year. The 12-month non-deliverable forwards have not always accurately predicted when devaluations would occur, and the size of that market is small, allowing a small number of traders to have a large impact on the price, but they have been an indicator.

So what steps should Vietnam take? I think primarily we need to stabilize inflation with the important side effect of increasing confidence in the dong even at the expense of faster short term growth. Part of inflation is based on worldwide oil prices out of Vietnam's control, but we can try to be less dependent on oil and shift some responsibility to the Vietnamese free market by reducing subsidies for transportation fuel and we should be developing public transportation. Luxury imports could be discouraged. As a nation highly susceptible to prices on imported goods we should encourage export industries that are less dependent on imports and particularly encourage lending to these sectors while otherwise raising interest rates relatively to cool down other parts of the economy. We should consider the increase in food prices such as rice, which we are actively exporting to other countries, and we should be cognizant of the renewed housing boom which is making housing more expensive for everyone.

Vietnam ISPs

Submitted by tomo on February 16, 2011 - 4:19pm

With the recent instability in Facebook access and some people reporting it blocked on one ISP or area of Vietnam while others can access it just fine... Is it time for you to rethink your choice in internet service provider?

There are a dozen or so ISPs in Vietnam but I've only looked at the biggest: Viettel, VNPT, FPT, SPT, and Netnam. They also all provide fiber internet (FTTH - fiber to the home) and not just ADSL. Here are their prices for selected packages:

Read the rest of this article...

I've tracked down a source of the bug which breaks jQuery (1.2.6) in Firefox (Chrome is fine), where you'll see a debug message of "z.indexOf is not a function". If you're running a minified jQuery then the line number won't help locate the bug, but in this case it was around line 1715:

type == "^=" && z && !z.indexOf(m[5])

This code is triggered by jQuery attribute filters like ^= (starts with) or *= or ~=. If z had been 0, the "&& z" check would short-circuit and the code would never try to call indexOf on z.

Looking deeper, I found that z == -1 (a number, not a string). This was because I was filtering on the 'value' attribute, and in Firefox the 'li' node was being given a value of -1. You can check this by running "$('li')" and inspecting the returned values. In Chrome, there is no value. This difference causes the bug in Firefox.
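The failure can be reproduced outside jQuery and outside the browser. The sketch below is not jQuery's actual source: startsWithUnsafe is a simplified stand-in for the "^=" branch quoted above, and startsWithSafe is a hypothetical guard that coerces the attribute value to a string first.

```javascript
// Simplified stand-in for the "^=" check quoted above.
// z is the attribute value read from the node, prefix is the text after "^=".
function startsWithUnsafe(z, prefix) {
  return Boolean(z && !z.indexOf(prefix));
}

// Hypothetical guard (an assumption, not jQuery's fix): coerce z to a string
// so that indexOf always exists.
function startsWithSafe(z, prefix) {
  return z != null && String(z).indexOf(prefix) === 0;
}

console.log(startsWithUnsafe("whatever-1", "whatever")); // true
console.log(startsWithUnsafe(0, "whatever")); // false, "&& z" short-circuits

try {
  // -1 is truthy, so indexOf is called on a number and throws.
  startsWithUnsafe(-1, "whatever");
} catch (e) {
  console.log(e instanceof TypeError); // true
}

console.log(startsWithSafe(-1, "whatever")); // false, no exception
```

The point is that a numeric 0 happens to be safe because it is falsy, while any other number reaches the indexOf call.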

One workaround is to use attribute filters only with selectors that select for specific tags which exclude 'li', at least when filtering on 'value'. For example, use 'input[value^=whatever]' instead of just '[value^=whatever]'.

PHP 5.2 has support for showing the percentage uploaded for a file upload in progress. If you're not running Apache as your web server,

Drupal's FileField module automatically detects and uses upload progress support on the server end. This can be either APC (Alternative PHP Cache) with rfc1867 support or the uploadprogress PECL extension. In Drupal, the upload progress bar looks like this:

You can check to see if you already have support by going to admin/reports/status.

If the report shows that your server has support yet FileField CCK fields aren't updating the upload progress bar then your server has a problem.

Read the rest of this article...

Getting Drupal to stream video using PHP and the FlashVideo module to manage video uploads is not easy. It involves 5 distinct pieces of software, which means 5 places where things could go wrong with little error logging.

Here's what you need:
1. FlashVideo http://drupal.org/project/flashvideo
2. ffmpeg
3. flvtool2
4. xmoov-php
5. JW Player

Read the rest of this article...

Server Colocation and VPS Hosting in Vietnam

Submitted by tomo on January 23, 2011 - 6:30pm

I used to run a hosting company back in the states. I wouldn't want to get into that business again as it's capital-intensive (for a tech company) and highly competitive.

Looking at the server colocation market in Vietnam it seems small and expensive for what you get, and maybe there is far more demand from Vietnam-based businesses to host in the US because that's where their customers are. But companies that aim for the domestic market need to consider hosting locally since the Internet link to the US is relatively narrow and expensive with high latency.

When I searched for colocation servers in Vietnam, the #1 result, vn84.com, was down; webhosting.com.vn's account has been suspended; aacecom.net is now a parked domain just showing ads. Not very good results.

After distilling the first 100 or so results on Google:

Conclusion: FPT is definitely a stable business, yet prices aren't out of line, with higher upload speeds. It's always best to avoid anything priced in USD since the Vietnam dong is certainly going to weaken against the dollar soon. But even the cheapest colo at 1.3 million VND is more than I can lease a dedicated server for in the US.

So how much are dedicated servers at these places?

Oddly, for pavietnam.com the price to lease a dedicated server is cheaper than to buy and colo your own!

Virtual Private Servers are another option. Let's compare:

They come out to roughly a third of the cost to colo. You'd be better off finding a friend to chip in and get pavietnam's dedicated server deal.

UPDATE April 3, 2012:
There are a lot of small providers in Vietnam outside of the big ones listed above. If you want to check some out, many even have free trials. Check out the forum at vn-zoom.com if you can read Vietnamese.

For colocation, Viettel's IDC "Sóng Thần" datacenter in Binh Duong is the largest in Vietnam, and possibly all of southeast Asia.

When considering Singapore as an alternative hosting center, be aware that prices for VPSs in Singapore are generally significantly higher than what you would pay in the US. Latency is still around an order of magnitude (10x) that of a Vietnam-based host, though only 1/2 to 1/3 that of a North American host. YMMV.

Google Spam / Content Farm Filter

Submitted by tomo on January 21, 2011 - 3:06pm

There's been a lot of talk about the decrease in quality of Google search results over the years due to spammers / content farms with strong SEO skills. I'm glad I'm not the only one who's been annoyed by this.

Google should know which sites are spam, content farms, or duplicated content. That they aren't properly filtering or demoting them could be due to a conflict of interest - they make money from the ads on those crap sites.

But we, as individuals, can easily distinguish the spam results from the quality ones and we do so every day. If only there were a way to stop duplicating this effort.

If Google won't do this for us, then we can do this ourselves.

Here's what I want:
1. When I've been tricked into opening an ad-filled page without meaningful content, I want to go back to Google and mark that link as "spam", have that noted somewhere in the cloud so I can access it from any computer, and have future search queries filter out that link.

2. I probably don't want to see any pages from that domain show up on any other queries.

3. I probably don't want to see any pages that my friends have also marked as spam.

4. I probably don't want to see any pages that friends of my friends have also marked as spam.

5. I may even want to befriend / "follow" strangers just because they're good at marking spam.
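The core of this wish list can be sketched as a small data structure plus a filter. Everything here is hypothetical: the users object, the function names, and the .example domains are made up for illustration, and a real version would keep the block lists on a server rather than in memory. Each user has a set of blocked domains and a list of people they follow; blocks are collected up to a chosen number of hops (0 for just me, 1 for friends, 2 for friends of friends).

```javascript
// Hypothetical data model: blocked domains and followed users, per user.
const users = {
  me: { blocked: new Set(["farm-one.example"]), friends: ["alice"] },
  alice: { blocked: new Set(["farm-two.example"]), friends: ["bob"] },
  bob: { blocked: new Set(["farm-three.example"]), friends: [] },
};

// Collect blocked domains from a user and their network, up to maxDepth hops away.
function blockedDomains(users, userId, maxDepth) {
  const blocked = new Set();
  const seen = new Set([userId]);
  let frontier = [userId];
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const id of frontier) {
      const user = users[id];
      if (!user) continue; // unknown user id, nothing to collect
      for (const domain of user.blocked) blocked.add(domain);
      for (const friend of user.friends) {
        if (!seen.has(friend)) {
          seen.add(friend);
          next.push(friend);
        }
      }
    }
    frontier = next;
  }
  return blocked;
}

// A host is blocked if it equals a blocked domain or is a subdomain of one.
function isBlocked(host, blocked) {
  for (const domain of blocked) {
    if (host === domain || host.endsWith("." + domain)) return true;
  }
  return false;
}

// Drop every result URL whose host falls under a blocked domain.
function filterResults(urls, blocked) {
  return urls.filter((url) => !isBlocked(new URL(url).hostname, blocked));
}

const results = [
  "http://www.farm-one.example/dog-shampoo",
  "http://farm-three.example/copied-answer",
  "http://good-site.example/dog-shampoo-review",
];
// With friends of friends included, only the last result survives.
console.log(filterResults(results, blockedDomains(users, "me", 2)));
```

Blocking by domain rather than by link covers point 2, and the depth parameter covers points 3 and 4. Following strangers (point 5) would just be more entries in the friends list.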

Read the rest of this article...

Pliggmeme Redux Pligg Theme

Submitted by tomo on January 17, 2011 - 2:21pm

Pliggmeme Redux is a slightly modified, debugged and fixed, and despammed version of the Pliggmeme theme for Pligg. The original Pliggmeme was inspired by Tweetmeme and based on skins4webs's Silverbullet theme. The new Pliggmeme was made for the Detroit Institute of Techno.

Some shady hidden spam was removed, some JavaScript was fixed, and you can actually submit stories and make comments now. This works with Pligg 1.1.3.

Read the rest of this article...

Easy Access to Facebook in Vietnam, Part Deux

Submitted by tomo on January 14, 2011 - 2:58pm

Here's an easier way to access Facebook again in Vietnam, as a follow-up to my earlier post Bypass Vietnam's Block on FaceBook - or China's Block on YouTube.

Just change your DNS settings to use 65.111.171.175 as your DNS server like before. I've set up a DNS server which returns different IP addresses for Facebook's domains (facebook.com and fbcdn.net). You can also do this on your own computer by setting entries for all subdomains of facebook.com to 153.16.15.71 and for fbcdn.net to 60.254.175.11.

For detailed instructions on changing your computer's DNS settings refer to http://code.google.com/speed/public-dns/docs/using.html but remember to use the address 65.111.171.175 instead of Google's 8.8.8.8 and 8.8.4.4.
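For the do-it-yourself alternative (per-host entries on your own computer instead of a custom DNS server), here is a rough sketch that prints lines in hosts-file format. It assumes "entries" means hosts-file lines. The hostname lists are illustrative assumptions, and since hosts files have no wildcard syntax, every subdomain you actually use has to be listed explicitly. The IP addresses are the ones quoted above and may no longer be valid.

```javascript
// IP addresses quoted in the post; they date from 2011 and may no longer work.
const FACEBOOK_IP = "153.16.15.71";
const FBCDN_IP = "60.254.175.11";

// Format one hosts-file line per hostname: "<ip><TAB><hostname>".
function hostsEntries(ip, hostnames) {
  return hostnames.map((name) => `${ip}\t${name}`);
}

// Illustrative hostname lists (assumptions); extend them with whichever
// subdomains your browser actually requests.
const lines = [
  ...hostsEntries(FACEBOOK_IP, ["facebook.com", "www.facebook.com"]),
  ...hostsEntries(FBCDN_IP, ["fbcdn.net"]),
];

console.log(lines.join("\n"));
```

The printed lines would then be appended to the system's hosts file by hand.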

Read the rest of this article...
© 2010-2014 Saigonist.