Troubleshooting

iPhone – Can’t access the internet while roaming?

If you can’t access the internet while roaming outside your home country, the issue may be down to a new switch enabled in iOS 8.

It was supposed to allow roaming without any extra charges in any EU country but it doesn’t do exactly what it says on the tin…

Find and disable “EU Internet” in your iPhone mobile settings and this should allow you to access the internet again. Or you could just use an Android phone…

Chrome slow or failing to connect to Gmail and other sites

We had problems at work with Chrome intermittently failing or being slow to connect to Gmail and other sites. I found this is down to an experimental QUIC transport layer protocol in Chrome.

To find and disable QUIC in Chrome, type:
chrome://flags
into Chrome’s URL bar and hit return. Search the flags page for QUIC and disable it, and you should be browsing the internet the way Chrome originally intended…

It doesn’t seem to be a problem if you have a direct connection to the internet (home user) but if Chrome’s throwing hissy fits when connecting to sites it’s worth a try disabling it.

You may also need to set the flag again if Chrome updates in the background or you update manually.

Intermittent port outages (link flaps) on Cisco 4500 series switch

Here’s a problem I initially suspected was being caused by our Cisco 4500 series switch or the software rev on the switch but it turned out to be something completely different.

Here’s the background of the problem:

Two of our office users started to have problems with their PCs intermittently losing network connectivity for about 10 seconds at a time, at which point the PC would reconnect to the network and behave perfectly for another couple of hours before disconnecting again.

Our new switch had been installed about a year or so ago with all new Cat6 cabling feeding through to our Comms room patch panel (or so I thought). Our PCs had been behaving fine for a number of months and connecting fine at gigabit speeds.

The cabling from our patch panels out to our office floor ports is all Cat5e Gigaspeed certified.

The fact that two users were having the same problem was strange and initially had me worried. It was doubtful this was a problem with the OS build, otherwise a lot more users would’ve reported the issue. It was also unlikely this was a virus, as the likelihood of two PCs being infected with the same virus at the exact same time was low.

So I started to dig some more – here’s what I found:

I was lucky to be in a position where I was able to catch the network disconnect (as I’m calling it for now) on one of the floor boxes affected, at which point I unplugged the affected PC and plugged in a laptop.

The laptop displayed the same behaviour as the desktop PC – no network communication and then after a few seconds it came to life again.

This is what I was looking at on the PCs after they experienced an outage:

Event Viewer Error Message

It turned out a number of other users were having the same problem but hadn’t noticed the issue as the network outage was so short.

The above got me worried as the issue looked to be switch based – either a problem with the line card or the software rev on the switch itself.

At this point I contacted our Network team and laid out what I found.

Problem was they couldn’t find any port outages recorded on the switch. They suggested we were having a problem with the NIC/PC hardware and that manually setting 1Gb on the NIC and switch port should fix the problem.

I had a problem with this suggestion, however, as it did not explain why my laptop would not receive traffic (not even a link light) when it was plugged into an affected floor port. Manually setting 1Gb on the switch and PC is also not best practice.

So a lot more Event Viewer digging on a lot more office PCs revealed the same network outage conditions with the same Event Viewer error message reported. I was also able to find network outages on a Mac, so this was highly unlikely to be an OS build or hardware issue on our PCs.

The Network team were still unable to see any outages on the switch so they weren’t likely to dig further until I suggested they check that switch logging was actually switched on and configured to detect disconnects, and check the software revision on the switch for any bugs that match the disconnect issue…

It was at this point they found out that link state messages were disabled in the image they were using on the switch. After they got link message logging enabled they were finally able to see the port disconnects which they found to be down to link flaps.

The reason they couldn’t see the drops wasn’t down to what I thought – but I took this as a win as they were now able to see the port outages…

After much back and forth about the possible causes of the issue, including BPDU packets and questions about the level of logging the Network team had enabled, I began to investigate our network cables as the cause. I didn’t think it was possible for a cable plugged into a switch to cause a port to disable itself completely for 10 seconds at a time, especially when the PC connected to that cable behaves itself 99 percent of the time…

An incorrect assumption – but you learn something new every day!

My checks indicated all the cables were rated for gigabit speeds until I found a bunch of cables that stood out from the others purely because they did not have “Gigaspeed” written at the end of the cable identifier. The cables in question were Systimax 1074D 4/24 (UL) and they ran from the switch to the patch panel.

It turned out almost all the cabling running from the switch to the patch panel had been replaced with Cat6 but not all. Some Systimax 1074D 4/24 (UL) cable – about 15% of the old cable running from the Cisco 4500 switch to the patch panel was still in place…

I couldn’t find any information about the Systimax 1074D 4/24 (UL) on the internet so I had no clue if it was Cat5, 5e or Cat6 certified. Thankfully a great guy called Alaric Jenkins from network cabling supplier Info Stor responded to my query and was able to tell me that the Systimax 1074D 4/24 (UL) cable is Cat5e but not certified for Gb ethernet.

So here’s what I learned:

1. Just because you’re able to get gigabit speed on your PC using a Cat5e cable 99% of the time – that doesn’t mean your Ethernet cable is gigabit certified…

2. Cabling can adversely affect your switch ports, even going so far as to completely disable them.

Convert Sony HD422 MXF files to MOV

This one had me stumped for a while…

A Pro Sony HD422 camcorder was purchased by one of our newspaper titles for interviews, shoots etc. Problem was no one knew how to export the proprietary MXF format videos the camcorder records to a format that’s compatible with Final Cut Pro, as nobody on site had any pro video camera experience…

I’m sure a lot of first time videographers have had trouble with this particular camera, so here are the absolute minimum requirements you need to get those MXF files into Final Cut Pro for editing:

I went with FCP 6 (updated to 6.0.6) as we already had a licensed copy of it available. I just wasn’t sure if FCP 6 could handle wrapping the MXF files as MOV until I found the FCP 6.0.3 release notes.

The prerequisites are:

1. Final Cut Pro 6.0.6 (FCP 6 is available as part of Final Cut Studio 2.0; once that’s installed, an update using Apple Software Update takes FCP up to 6.0.6)

or Final Cut Pro 7 (Install all latest updates available)

2. XDCAM Transfer 2.16. Available from the Sony website (PDZK-P1_XDCAM_Transfer_v2_13_0.zip)

The Final Cut Pro update from 6.0 to 6.0.6 installs the codecs necessary for XDCAM Transfer to read the XDCAM/HD422/MXF files the Sony HD422 creates. Once the codecs are installed, XDCAM Transfer can wrap the MXF files as MOV for Final Cut Pro to read and edit if you so wish. At that point you can then export to whatever video format you want.

Once the above has been installed, launch XDCAM Transfer, open up the MXF files stored on your SxS card or stored locally on your Mac, and click on the “Import” button down the bottom right side of the app.

From there browse to “/Users/<username>/Movies/Sony XDCAM Transfer” and you’ll find sub-folders containing the MOV wrapped MXF files which can then be opened in FCP. The only caveat we have here is that FCP 6 will not install on OS X 10.8 but you’ll be fine on 10.6.

Make sure Sender Policy Framework (SPF) is correctly configured

We had some issues at work recently with intermittent e-mails from a third party not arriving in our Google Apps mailboxes (we’ve now migrated to using Google Apps for Business as opposed to using Exchange exclusively).

This was a big problem for us as these e-mails are an important source for news stories.

I immediately suspected spam/DNS server issues due to my previous experience troubleshooting our Ironport issue two years ago.

This time, however, the problem lay with the third party’s DNS, not ours.

Here are the steps I took to troubleshoot then fix the problem:

Symptoms:

The issue was first reported when one of our news reporters noted that some e-mails sent in from this news source’s distribution list were not arriving.

Some e-mails, not all, were being dropped – this is an important factor as we’ll see below.

Troubleshooting steps:

1. I realised straight away the problem wasn’t related to any e-mail filters in the reporter’s inbox or SMTP blacklists since only approx 20% of the mails were not arriving. Nonetheless I ran the necessary checks and found nothing causing mail to be filtered into the user’s bin.

2. I checked in on another colleague whose mail domain is different from that of the news reporter who detected the problem (we use multiple domain names for separate titles and business teams). That colleague indicated they were missing the exact same e-mails that the first reporter found missing.

3. The distribution list in question is used by a number of other news organisations so I sent a mail out to another news organisation’s IT dept to ask if they’d noticed those particular e-mails missing – they hadn’t.

4. I then asked an ordinary user at the news source’s site (I’m calling her [email protected] here) to send me an e-mail so I could evaluate the incoming e-mail headers for anything out of place. The e-mail she sent arrived fine, but the information I found in her mail header provided the light bulb moment when taken together with the rest of the information gathered. I found two key entries in [email protected]’s mail header:

Received-SPF: fail (google.com: domain of [email protected] does not designate 137.191.225.35 as permitted sender) client-ip=137.191.225.35

Authentication-Results: mx.google.com; 

spf=hardfail (google.com: domain of [email protected] does not designate 137.191.225.35 as permitted sender) smtp.mail=137.191.225.35;
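If you have a pile of headers to go through, the two interesting fields can be pulled out with a few lines of Python. This is only a rough sketch and not something I used at the time; the function name is made up and sender@example.org stands in for the real (anonymised) address.

```python
import re

def parse_received_spf(header):
    """Return (result, client_ip) from a Received-SPF header line."""
    # The SPF result is the first word after the header name.
    result = re.match(r"Received-SPF:\s*(\w+)", header, re.IGNORECASE)
    # The connecting server's address follows "client-ip=".
    client = re.search(r"client-ip=([0-9A-Fa-f.:]+)", header)
    return (
        result.group(1).lower() if result else None,
        client.group(1) if client else None,
    )

# sender@example.org is a placeholder address for illustration only.
header = (
    "Received-SPF: fail (google.com: domain of sender@example.org does not "
    "designate 137.191.225.35 as permitted sender) client-ip=137.191.225.35"
)
print(parse_received_spf(header))  # ('fail', '137.191.225.35')
```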

It was time to brush up on SPF, or Sender Policy Framework.

SPF basically boils down to a DNS entry that indicates which SMTP servers are permitted to send e-mail on behalf of a mail domain. This helps keep spoofed spam out of a user’s inbox, since spam prevention e-mail gateways like Ironport and Postini will perform a DNS verification check each time a particular SMTP server tries to deliver an e-mail purporting to be from a particular e-mail address (domain).

As an example, say a spammer tried to deliver an e-mail to [email protected] by spoofing an innocent individual’s e-mail address (let’s say [email protected]) and used their spamming SMTP server to try to send [email protected] a spam e-mail. When it’s first contacted, the Ironport/Postini/spam gateway examines the mailing domain indicated in the e-mail address of the message, say “hotmail.com”, and the accompanying IP address of the spamming SMTP server.

Ironport/Postini will then query DNS for hotmail.com (which should always be accessible on the internet) to check that domain’s SPF record. If it sees that none of the IP addresses listed in the SPF record match the IP address of the SMTP server that just tried to deliver an e-mail purporting to be from [email protected], it will drop that e-mail delivery attempt to [email protected], thereby preventing the spam reaching [email protected]’s inbox. According to hotmail.com’s DNS SPF record, the spamming SMTP server trying to deliver a message purporting to be from [email protected] is not authorized…

Now back to our problem. Our news organisation’s Postini gateway was designating the IP address of our third party’s SMTP server, 137.191.225.35, as not being permitted to send e-mail on behalf of that user.
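The check described above can be sketched in a few lines of Python. To be clear, this is a toy: it only understands ip4: entries (a real SPF evaluator also handles mx, a, include, qualifiers and so on), and the record and addresses are made-up documentation values, not anyone’s real configuration.

```python
import ipaddress

def spf_ip4_networks(record):
    """Collect the networks named by ip4: mechanisms in an SPF record."""
    networks = []
    for term in record.split():
        if term.lower().startswith("ip4:"):
            # A bare address such as ip4:192.0.2.10 is treated as a /32.
            networks.append(ipaddress.ip_network(term[4:], strict=False))
    return networks

def ip_permitted(record, sender_ip):
    """True if sender_ip falls inside any ip4: network in the record."""
    addr = ipaddress.ip_address(sender_ip)
    return any(addr in net for net in spf_ip4_networks(record))

# Hypothetical record for illustration only.
record = "v=spf1 ip4:192.0.2.10 ip4:198.51.100.0/24 -all"
print(ip_permitted(record, "192.0.2.10"))   # True: listed, so the mail is accepted
print(ip_permitted(record, "203.0.113.5"))  # False: not listed, "-all" says reject
```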

I had two problems with that –

1. First why did the SPF fail?

2. The SPF failed but [email protected]‘s e-mail got through anyway.

It was time to probe our news source’s (xxxx.ie’s) DNS configuration. In the process of doing so I discovered it was possible to expose the SPF configuration for DNS servers available on the internet using this tool.

So I plugged in the IP Address of the 137.191.225.35 SMTP server designated not authorized in [email protected]‘s e-mail header.

The Beveridge SPF Test tool then exposed the following SPF configuration for the xxxx.ie domain (some IP addresses changed for security purposes):

v=spf1 mx ip4:137.191.xxx.x1 ip4:137.191.xxx.x2 ip4:137.191.xxx.x3 -all [TTL=86400]

But I also noticed that subsequent SPF tests sometimes returned a different SPF configuration, shown below:

v=spf1 mx ip4:137.191.xxx.x1 ip4:137.191.xxx.x2 ip4:137.191.xxx.x3 ip4:137.191.225.35 ip4:137.191.xxx.x5 -all

So what was I looking at?

Basically one or more DNS servers for the domain xxxx.ie which our Postini gateway was performing SPF checks against had an incorrect/out of date SPF configuration. The second SPF configuration was the correct one as it listed 5 SMTP servers that were authorized to send mail on behalf of the xxxx.ie domain.

The first incorrect SPF also should not have contained the  “[TTL=86400]” reference, as this doesn’t conform to SPF standards.

Cross-referencing the IP address of the 137.191.225.35 SMTP Server in the e-mail header I found that 137.191.225.35 matched a missing IP address in the incorrect SPF record. 137.191.225.35 was present in the second SPF record but not the first.

So our Postini was hitting the DNS server with the incorrect SPF configuration about 20% of the time and therefore dropping 20% of the e-mails sent because it was determining that 137.191.225.35 wasn’t authorized to send e-mail on behalf of xxxx.ie 20% of the time.
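Spotting the difference between the two records by eye is easy enough here, but the comparison can also be done mechanically. Here’s a minimal Python sketch (my own illustration, with made-up function names) that takes the two records exactly as shown above, masked addresses and all, and lists the entries the stale record is missing.

```python
def spf_terms(record):
    """Split an SPF record into terms, dropping tool annotations like [TTL=...]."""
    return [term for term in record.split() if not term.startswith("[")]

def missing_from(stale, current):
    """Terms present in the current record but absent from the stale one."""
    stale_terms = set(spf_terms(stale))
    return [term for term in spf_terms(current) if term not in stale_terms]

stale = ("v=spf1 mx ip4:137.191.xxx.x1 ip4:137.191.xxx.x2 "
         "ip4:137.191.xxx.x3 -all [TTL=86400]")
current = ("v=spf1 mx ip4:137.191.xxx.x1 ip4:137.191.xxx.x2 ip4:137.191.xxx.x3 "
           "ip4:137.191.225.35 ip4:137.191.xxx.x5 -all")

print(missing_from(stale, current))
# ['ip4:137.191.225.35', 'ip4:137.191.xxx.x5']
```

The sending server from the mail header, 137.191.225.35, is one of the two entries the stale record lacks.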

I’d figured out the issue was due to an incomplete SPF record on one xxxx.ie DNS server, but why did the e-mail from [email protected] get delivered even though it was given an SPF failure, while some e-mails from the distribution list [email protected] were not being delivered at all?

My guess is that SPF checking on Google/Postini is evidently not that strict, so an SPF failure alone doesn’t get a message dropped. There must be additional spam filtering criteria in place on our Postini that score the [email protected] e-mails based on message content, and these, together with the SPF failure returned from the problem DNS server, flagged the [email protected] e-mails as spam 20% of the time. Our Postini services are provided by an outsourced company so I don’t have access to the specific spam filtering criteria to verify this.

Fix:

I contacted the IT Security department manager for xxxx.ie, who confirmed my findings and isolated the problem DNS server. The DNS server in question was not under his dept’s direct control but was managed by another dept – I’m guessing for redundancy purposes.

He logged a change to have the SPF details applied to the problem DNS server which should eliminate our problems receiving mail from any @xxxx.ie addresses in the future.

Hopefully this has given you an insight into the complex world of spam filtering…

Who’s best for WordPress hosting: GoDaddy vs Bluehost

Well – I’d finally had enough of GoDaddy’s incessant time outs which they’d attributed to common run of the mill plugins like Jetpack and others.

I’d installed their recommended caching plugins to try and improve page load times and even installed their own proprietary P3 profiler plugin to measure my blog stats, but it didn’t tell me anything I didn’t already know: that my tiny WordPress blog was responding like a fly stuck in a jar of honey. Granted I’m on a shared plan and I take it as a given that the page load times won’t be instant on a shared host – but page time-outs and ultra slow load times I will not stand for.

So it was time to up sticks and move to another provider.

My main criteria for moving was that the host have great performance and support for WordPress.

There’s a ton of conflicting advice on the internet about the best WordPress host – believe me, I’ve researched this for a long time – so I decided to go with the advice from the horse’s mouth, WordPress itself, and Bluehost it was. Look, they’re right up the top, like the gold winning champ they are, and it’s been a huge improvement. The time outs and slow page opening times have been eliminated, even though I’m still using Jetpack – isn’t that strange, GoDaddy?

I’ve also found out the reason GoDaddy’s shared hosting can’t cut the cheese – they don’t throttle down “abusive” users like Bluehost do with their shared hosting users.

My advice – go with Bluehost for your WordPress blog, it’ll save the hair pulling and back pain of transferring your blog from GoDaddy when you realise they’re not up to the task. It was a great learning experience, but not something I’d recommend if you don’t have a technical background.

Jetpack disappears after installing version 1.9.2

I had some problems updating to the latest version of Jetpack today – version 1.9.2. Normally the automatic install from within the Dashboard carries off without a hitch but today I had to dig into my GoDaddy FTP host storage to resurrect my Jetpack statistics.

I started off the automatic install but at the point where the new Jetpack files are downloaded and unzipped the installation looks like it goes nowhere.

I attempted to kick off the install again and left it overnight to see if that would jump start Jetpack, but when I checked my Dashboard this morning I found the admin GUI and all my stats had been erased from the Dashboard like it had never been installed – not good.

So I decided to launch my GoDaddy Account Admin FTP Manager – that wasn’t playing ball either, returning a spinning circle when attempting to read the wp-admin, wp-includes and wp-content folders.

Time to break out the big guns. Luckily I’d already installed FileZilla and knew the ins and outs of connecting to GoDaddy’s FTP. What I found when I connected to my shared FTP was a jetpack.tmp folder located within wp-content/upgrade. I was wary of deleting this folder, but there were no config files lower down in any of this folder’s sub-folders, so I guessed the update process had aborted itself for whatever reason and went ahead and deleted jetpack.tmp.

All Jetpack stats are stored at WordPress.com so after another Jetpack 1.9.2 automatic install from the Dashboard, I got them all back – job well done.

Latest posts will not appear on my WordPress blog

I had some problems with my WordPress blog not displaying new posts unless I was logged in as an admin.

Fine for me but a major problem for readers of my blog – since they can’t all be logged in as admins 😉

I found the problem started once I updated to WordPress 3.4.0.

If you use WP Super Cache 1.1, try de-activating it: for me, the new posts showed up straight after I did.

Don’t get me wrong – I love WP Super Cache as it vastly improved my blog’s performance (hosted with GoDaddy) by eliminating a lot of page time-out errors a few months ago (I’m on a shared plan and performance isn’t a priority for me yet).

I’m not going to say the above is specifically an incompatibility between WordPress 3.4 and WP Super Cache 1.1 as I don’t have a very common hosting configuration (so another factor could be involved) but for me – that’s what worked.

If it works for you, please get in touch with Donncha, the plugin developer.

I’m going to leave WP Super Cache disabled for now as I’m not seeing as many page time out problems with GoDaddy as I did a couple of months ago but I’ll certainly be looking at WP Super Cache 1.2 when it gets released.

PC will not appear in my SCCM OS deployment collection

Here’s one we were having problems with a couple of months ago:

My colleague, after registering two PCs in “Operating System Deployment – Computer Association” and making them members of our “OS Pre-Staged Computers for Deployment” collection, found there was absolutely no trace of either PC in the “OS Pre-Staged Computers for Deployment” collection when it came time to assign these PCs to their required OU and role in order to kick off deployment.

It was a very odd situation as the PCs had been purged from SCCM and AD before being re-registered for a bare-metal build scenario.

After going back and registering the PC tags and corresponding MAC details in “Computer Association” again, I still couldn’t find any trace of either PC in the OS Deployment collection, even after “Updating Collection Membership” and refreshing the collection. Note that re-registering PC details when they don’t show up in your OS Deployment collection the first time probably isn’t that great an idea (you risk duplicate entries in your SCCM database records) – but I was running out of options at that stage myself.

A double-check of the PC tag and the MAC address the PC was returning from a paused F12 PXE boot confirmed he had used the correct details for the machines in question. I knew the PCs had been removed from AD and SCCM, but just for a hoot I double checked and found nothing. The PCs could boot into F12 fine and pick up a valid IP address so this had me stumped.

After spending quite a bit of time trying to figure out why the PC tag was not registering in SCCM, I decided to perform an nslookup of the PC tag we were trying to associate in SCCM. It was at that point I noticed the IP address returned by nslookup was different to the ones assigned by PXE/DHCP boot for both PCs. My conclusion was that DNS had not been doing its job properly purging old DNS registrations.

I asked our Security team to purge any old DNS registrations that matched the PCs we were trying to build, and about an hour later, after re-registering the PCs in the Computer Association node, they finally appeared in our OS deployment collection and I was able to assign the PCs to the correct OU and role.
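The nslookup comparison is easy to script if you have a batch of machines to check. Below is a rough Python sketch of the idea, not what I actually ran: the function name, the PC tag and the 10.0.0.x addresses are all invented for illustration, and the stub resolver is only there so the example runs without touching a real DNS server.

```python
import socket

def check_dns_record(hostname, leased_ip, resolve=socket.gethostbyname):
    """Compare what DNS returns for hostname with the IP handed out by DHCP."""
    try:
        resolved = resolve(hostname)
    except OSError:  # socket.gaierror (name not found) is a subclass of OSError
        return "no DNS record"
    if resolved == leased_ip:
        return "DNS matches DHCP lease"
    return f"stale DNS record: {resolved} (DHCP gave {leased_ip})"

def stub_resolver(name):
    """Stand-in for DNS holding one hypothetical leftover registration."""
    table = {"pc-tag-001": "10.0.0.50"}
    if name not in table:
        raise socket.gaierror("unknown host")
    return table[name]

print(check_dns_record("pc-tag-001", "10.0.0.87", resolve=stub_resolver))
# stale DNS record: 10.0.0.50 (DHCP gave 10.0.0.87)
```

Leave out the resolve argument to query your real DNS instead of the stub.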

Problem with Direct Access and Vodafone 3G/HSDPA – Update. SEP is the problem

I found the source of our Direct Access problems back in May during a lull period at work. Sorry for not updating then guys…

Here’s a link to my previous Direct Access post

I had a suspicion our anti-virus may have been causing our Direct Access problem so I went ahead and removed Symantec Endpoint Protection using CleanWipe, which you can find here.

I’m not that big a fan of Symantec as we had problems with our previous generation of PCs – we suspected it was causing random out of the blue power downs of some PCs while users were working. We could never prove it though as no event viewer logs were recorded just before the PCs powered down unexpectedly. Luckily we’ve not had any problems of this type on our current generation of PCs – they do have a new version of SEP installed though! In hindsight I should have tried removing SEP to test with Direct Access sooner in light of my previous experience with Symantec.

Anyway – back to Direct Access. Once SEP was uninstalled, with MS Security Essentials replacing it, Direct Access started behaving as it should on our laptops over an HSDPA/3G connection, and I didn’t have to run the “netsh interface 6to4 set state disabled” command indicated in my previous post anymore.

I then decided to do some more digging on the subject once I realised SEP was causing the problem and found this little nugget from a Symantec forum post.

So if you’re thinking about deploying Direct Access and have SEP deployed you have three choices:

1. Wait for the SEP update to come out in August 2012.

2. Uninstall SEP and replace with an antivirus that works with Direct Access (might be workable if you’ve only a small group of users that need to use Direct Access) i.e. replace AV for that group and leave the others with SEP.

3. Create a batch file that will run the “netsh interface 6to4 set state disabled” command at startup on each laptop.
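For option 3 a one-line batch file does the job, but if you already push Python scripts to your laptops the same command can be wrapped like this. It’s only a sketch with an invented function name; it assumes the script runs with administrator rights, since netsh needs an elevated prompt to change interface state, and it does nothing on non-Windows machines.

```python
import subprocess
import sys

# The exact command quoted in the post, split into arguments.
NETSH_DISABLE_6TO4 = ["netsh", "interface", "6to4", "set", "state", "disabled"]

def disable_6to4(run=subprocess.run, platform=sys.platform):
    """Run the netsh command on Windows; skip quietly everywhere else."""
    if platform != "win32":
        return None
    # check=True raises CalledProcessError if netsh reports a failure.
    return run(NETSH_DISABLE_6TO4, check=True)

if __name__ == "__main__":
    disable_6to4()
```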

Hope this helps guys

 

The Blog of Martin Birrane