Get domain from URL

How to get the domain from the URL? It depends!

Lately I’ve spent some time trying to figure out the best way to solve this problem.

Scenario: a website reachable through two different second-level domains (and a bunch of third-level domains). There are no redirects from one domain to the other, or from the third-level domains to the second-level ones (and this behaviour couldn’t be changed). Each of the two SLDs has its own virtual host configured on Apache (this detail is very important, as you will see).

Please note: the following possible solutions use PHP, but I guess that, apart from the different syntax, the logic would be the same in any other language. I’m not a programmer anyway, so I won’t put much code here (feel free to add it in the comments, if you want).

One possible solution is to get the server name:

<?php
echo $_SERVER['SERVER_NAME'];
?>

and then take the last two dot-separated parts, counting from the end.

So, if SERVER_NAME is www.mydomain.tld, you would get mydomain.tld, which is the second-level domain.

This solution can be good enough if you know in advance that you are not going to use it with domains whose suffix itself includes a dot, like co.uk, com.mt or com.au, just to name a few.

But if the website is accessible through google.com and google.co.uk (the first example that came to my mind, I wonder why), this kind of solution would return google.com and co.uk. Not exactly what you’d want.
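A minimal sketch of this naive approach, just to make the limitation visible. The function name naive_domain() is made up for the example; it is not a PHP built-in.

```php
<?php
// Naive approach: keep the last two dot-separated labels of a hostname.
// naive_domain() is a hypothetical helper name, not part of PHP.
function naive_domain(string $host): string
{
    $labels = explode('.', strtolower(rtrim($host, '.')));
    // With two labels or fewer (e.g. "localhost") there is nothing to trim.
    if (count($labels) <= 2) {
        return implode('.', $labels);
    }
    return implode('.', array_slice($labels, -2));
}

echo naive_domain('www.mydomain.tld'), "\n"; // mydomain.tld
echo naive_domain('www.google.co.uk'), "\n"; // co.uk (not what you want)
```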

A more sophisticated solution would be to check the TLD against a list (there is one here but it’s not complete). If you have a complete list of TLDs, you can get the SERVER_NAME, check which TLD it ends with, and pick up the part of the hostname just before the TLD (plus the TLD itself, of course).
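The list-based idea could look roughly like this. It is only a sketch: registrable_domain() is a hypothetical name, and the $suffixes array is a tiny illustrative sample, nowhere near a complete list (a maintained one such as the Public Suffix List would be the natural input).

```php
<?php
// Sketch of the list-based approach. $suffixes is a tiny illustrative list,
// NOT a complete one; registrable_domain() is a hypothetical helper name.
function registrable_domain(string $host, array $suffixes): ?string
{
    $labels = explode('.', strtolower(rtrim($host, '.')));
    $count = count($labels);
    // Try the longest candidate suffix first, so "co.uk" wins over "uk".
    for ($i = 0; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $suffixes, true)) {
            // If the whole host is itself a suffix, there is no domain part.
            // Otherwise keep exactly one label before the matched suffix.
            return $i === 0 ? null : $labels[$i - 1] . '.' . $candidate;
        }
    }
    return null; // no known suffix found
}

$suffixes = ['com', 'uk', 'co.uk', 'com.au', 'com.mt'];
echo registrable_domain('www.google.com', $suffixes), "\n";   // google.com
echo registrable_domain('www.google.co.uk', $suffixes), "\n"; // google.co.uk
```

The quality of the result depends entirely on the list: a suffix missing from it gives either no answer or a wrong one.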

For both of the solutions above, you can find a lot of code snippets on Google.

But my favourite solution is the third! In fact, you can set a variable defining the domain in the virtual host on Apache (in the two virtual hosts, in my case):

<VirtualHost *:80>
    ServerName www.domain1.tld
    SetEnv MY_DOMAIN domain1.tld
</VirtualHost>

<VirtualHost *:80>
    ServerName www.domain2.tld
    SetEnv MY_DOMAIN domain2.tld
</VirtualHost>

This way it’s Apache that defines the exact domain value, and at this point you can read the variable in PHP with a simple

<?php
echo $_SERVER['MY_DOMAIN'];
?>
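If you want to be a bit more defensive when reading the variable, something like the following could work. It is a sketch under assumptions: configured_domain() is a made-up helper, and the REDIRECT_MY_DOMAIN fallback only matters if your setup does internal redirects (for example with mod_rewrite), in which case Apache may expose the variable under that prefixed name.

```php
<?php
// Hypothetical helper: read MY_DOMAIN from a $_SERVER-like array.
// Assumption: after an internal redirect Apache may expose the variable
// as REDIRECT_MY_DOMAIN, so that name is checked as a fallback.
function configured_domain(array $server): ?string
{
    foreach (['MY_DOMAIN', 'REDIRECT_MY_DOMAIN'] as $key) {
        if (isset($server[$key]) && $server[$key] !== '') {
            return $server[$key];
        }
    }
    return null; // SetEnv is missing from this virtual host
}

$domain = configured_domain($_SERVER);
```

Passing the array in as a parameter, instead of reading $_SERVER inside the function, makes the helper easy to try out from the command line.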

For the record, I needed the variable to create a cookie valid for the second-level domain and any subdomain of it. So, once the variable was defined in the virtual host, all I had to do was something like this:

$domain = $_SERVER['MY_DOMAIN'];
if (isset($_GET['parameter'])) {
    $variable = htmlentities($_GET['parameter']);
    // Cookie lasting 7 days, valid for every path on $domain
    setcookie("mycookie", $variable, time() + (60 * 60 * 24 * 7), "/", $domain);
}

In case you find yourself in the same situation, I hope this saves you some time.

P.S. As you can see now, being able to edit the virtual host is essential to use this solution.

Posted in Software

WordPress and XML sitemaps plugins

If you have a WordPress install with the multisite feature enabled, you may have had problems finding the right plugin to generate an XML sitemap to submit to search engines.

I usually use Google XML Sitemaps, probably the most widely used plugin for generating XML sitemaps on WordPress. Unfortunately, this nice plugin doesn’t work on WP Multisite. And it doesn’t generate multiple sitemaps.

If you want an XML sitemap plugin to generate sitemaps on your multisite WordPress, you may want to try WordPress SEO by Yoast, as far as I know the only plugin that works well at generating a sitemap on a WP multisite website.

But if you have a huge website, with thousands and thousands of URLs in it, you may have another kind of issue. In fact, none of the above-mentioned plugins generates multiple sitemaps (and the sitemap index, of course) in case you have more than 50,000 URLs to list. And by the way, 50,000 is the limit in the protocol, but Google doesn’t seem to love sitemaps with more than 10,000 URLs listed. If you have this issue, you should try Strictly Google Sitemap, a plugin that can generate multiple sitemaps (and with great performance!). The only problem I found using this plugin is that the permalink structure must include some numeric value (%post_id%, for example), or the generated sitemap won’t be correct.
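To show what "multiple sitemaps plus a sitemap index" means in practice, here is a rough sketch. It is not how any of the plugins mentioned here work: build_sitemaps(), the sitemap-N.xml file names and the default of 10,000 URLs per file are all assumptions made for the example.

```php
<?php
// Sketch: split a URL list into sitemap files plus one sitemap index.
// build_sitemaps() and the "sitemap-N.xml" naming are assumptions for
// illustration, not taken from any plugin.
function build_sitemaps(array $urls, string $baseUrl, int $perFile = 10000): array
{
    $ns = 'http://www.sitemaps.org/schemas/sitemap/0.9';
    $header = '<' . '?xml version="1.0" encoding="UTF-8"?' . '>' . "\n";
    $files = [];
    $index = $header . '<sitemapindex xmlns="' . $ns . '">' . "\n";
    foreach (array_chunk($urls, $perFile) as $n => $chunk) {
        $name = 'sitemap-' . ($n + 1) . '.xml';
        $xml = $header . '<urlset xmlns="' . $ns . '">' . "\n";
        foreach ($chunk as $url) {
            // Escape &, <, > and quotes so the XML stays well-formed.
            $loc = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');
            $xml .= '  <url><loc>' . $loc . "</loc></url>\n";
        }
        $files[$name] = $xml . "</urlset>\n";
        $loc = htmlspecialchars($baseUrl . $name, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $index .= '  <sitemap><loc>' . $loc . "</loc></sitemap>\n";
    }
    $files['sitemap-index.xml'] = $index . "</sitemapindex>\n";
    return $files; // file name => XML content
}
```

The function only returns the XML as strings; writing the files to disk and submitting the index to the search engines is left out on purpose.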

And what if you have a WP multisite where some of the websites in the network have more than 50,000 (or just 10,000) URLs? I’m afraid we have to wait for a plugin that handles this: I couldn’t find any.

Posted in Blog, Software | Comments Off

Pidgin cannot connect to MSN: the certificate chain presented is invalid

The certificate for omega.contacts.msn.com could not be validated. The certificate chain presented is invalid.

If you get this error when trying to connect to MSN Messenger with Pidgin today, here is the quick and easy way to fix the problem: just delete the contacts.msn.com SSL certificate.

rm ~/.purple/certificates/x509/tls_peers/contacts.msn.com

This way, Pidgin will download the SSL certificate again and everything will work as before.

Update: check the comments for other possible fixes.

Posted in Software, Ubuntu | 59 Comments

WordPress, Feedburner and sitemaps

UPDATE (19/03/2011): it seems the latest version of the plugin already takes care of Googlebot, so this post should be considered outdated.

If you use Feedburner for your WordPress feed, you probably use the FD Feedburner plugin for WordPress. The plugin is cool because it redirects your users to your Feedburner feed while letting Feedburner itself access your WordPress feed; and it’s really simple to configure.

But if you want to submit your feed to Google Webmaster Tools, Google will be redirected to your Feedburner feed too. While you may expect it to work, in some cases it won’t. In fact, if you track clicks on Feedburner, the links in your feed will be changed: Feedburner replaces the <link>URL</link> with an internal URL that redirects to your own URL after tracking the click.

As a consequence, if you submit your feed as a sitemap on Google Webmaster Tools, Google will show you errors like this:

(screenshot: Feedburner and Google sitemap)

This happens because the URLs in the Feedburner feed are not on your own domain but on http://feedproxy.google.com/

To fix this behaviour, the easiest solution is to have Google access your original feed (http://yourblog.tld/feed/) instead of being redirected to Feedburner. This can easily be done with a small change in the plugin.

Edit your plugin (with a text editor, accessing the file via FTP, or just from the dashboard -> Plugins -> Editor, selecting the FD Feedburner plugin) and look for this piece of code:

function feedburner_redirect() {
    global $feed, $withcomments, $wp, $wpdb, $wp_version, $wp_db_version;

    // Do nothing if not a feed
    if (!is_feed()) return;

    // Do nothing if feedburner is the user-agent
    if (preg_match('/feedburner/i', $_SERVER['HTTP_USER_AGENT'])) return;

    // Do nothing if not configured
    $options = get_option('fd_feedburner');
    if (!isset($options['feedburner_url'])) $options['feedburner_url'] = null;

Just replace the line

if (preg_match('/feedburner/i', $_SERVER['HTTP_USER_AGENT'])) return;

with:

if (preg_match('/(feedburner|google)/i', $_SERVER['HTTP_USER_AGENT'])) return;

and you are done. Google won’t be redirected to your Feedburner feed, and it will use your original feed as sitemap.
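To see what the modified check lets through, the same pattern can be tried on its own, outside WordPress. This is just an illustration: skips_redirect() is a made-up function name and the user-agent strings are sample values, not an exhaustive list of what Feedburner or Google send.

```php
<?php
// Standalone illustration of the user-agent test; skips_redirect() is a
// hypothetical name and the sample user-agent strings are only examples.
function skips_redirect(string $userAgent): bool
{
    // Case-insensitive match on either "feedburner" or "google".
    return preg_match('/(feedburner|google)/i', $userAgent) === 1;
}

var_dump(skips_redirect('FeedBurner/1.0 (http://www.FeedBurner.com)')); // bool(true)
var_dump(skips_redirect('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)')); // bool(true)
var_dump(skips_redirect('Mozilla/5.0 (X11; Linux x86_64) Firefox/4.0')); // bool(false)
```

Keep in mind that the pattern is deliberately loose: any client whose user-agent contains "google" gets the original feed, not only Googlebot.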

Posted in Blog, Google, Software | 2 Comments

Nice try

Derek Powazek – Spammers, Evildoers, and Opportunists. This is a kind of link-baiting tactic I don’t like too much: the attack hook.

This is why, for once, I’m using a nofollow attribute.

Posted in Seo | 1 Comment

Bigdump

While moving a website to a new hosting provider, I had a problem importing the database. In fact, the export was far too big (30 MB) compared to the maximum upload size allowed by phpMyAdmin on the new hosting (1 MB, where “M” maybe stands for “miserable”). Of course, no shell access…

So? Fortunately, I found BigDump, a GPL script that lets you import the exported file into the new database. Excellent!

Posted in Software | Comments Off

Spam of the day

Just got yet another spam. But this one reminded me of a funny quote.

The email’s subject was: Millions of customers can’t be wrong! (and then the usual stuff about penis enlargement bla bla bla ;) ).

Well, this is what immediately came to my mind:

(image from Cafepress)

Posted in Blog | Comments Off

Free IQ Test

IQ Test
Free-IQTest.net – IQ Test

Posted in Link love | Comments Off

Wolda 2008

Hungary Logo Eulda 2007

This one is my favourite logo from Eulda 2007, the European Logo Design Annual. Now Eulda has gone global, becoming Wolda: the Worldwide Logo Design Annual.

As you can see on the Wolda website, you can submit your logo for Wolda ’08: any logo printed, published or visible online between January 1, 2007 and December 31, 2007 is eligible.

Posted in Link love | Comments Off

Web Analytics Jokes

On 18th Dec 2007 Avinash Kaushik, author of Web Analytics: An Hour a Day, wrote a post entitled Web Analytics Demystified.

On 7th Jan 2008 Eric T. Peterson, author of Web Analytics Demystified, wrote a post entitled Web Analytics: An Hour a Day.

Posted in Web Analytics | Comments Off