The Engadget Interview: Michael Robertson, CEO of SIPphone

For this week's Engadget Interview, veteran journalist J.D. Lasica spoke with SIPphone Inc. CEO Michael Robertson about his startup company's battle against Skype (which was purchased by eBay earlier today), the Gizmo Project, open standards, and the coming era of always-on, always-connected voice communication.

Let's start with a quick backgrounder on SIPphone.

Michael Robertson

SIPphone was started two years ago in San Diego. The goal was to bring Voice Over IP to the mass market. But more than VoIP, the goal was to emphasize SIP, which is an open standards signaling protocol. The goal was to push voice to be more like email and less like instant messaging. With email, you have an email address and you can email anyone in the world and they can email you back. Contrast that with IM, where if you're on MSN I can't instant-message you because you're on Yahoo or someone's on AOL.

At SIPphone we do several things. We run a directory, what people in IP land call a proxy server. This is the server that connects two people. We help SIP hardware manufacturers - these are routers, adapters, and even wi-fi SIPphones. When you buy one of those devices and you plug it in, it has to connect to a directory so it can connect calls and give you a dial tone, so we work with them to make those auto-configure.

And very recently we released software that uses our directory called Gizmo Project. That's in Mac, Windows, and we just added Linux last week.

For people who aren't familiar with your business background, why don't you give us the nickel tour?

Most people know me from, a company I founded and took public and eventually sold for about $400 million. We built the largest digital music company and we were profitable when we sold the company. People sort of forget now, but when I started, RealAudio had an 85 percent market share, and everybody said RealAudio was the future and Microsoft had just come out with Windows Media, and other people said Microsoft would dominate. I was one of the few voices that said, mp3 is the way to go. The media companies all hated it. We went to court to defend the first mp3 player. So I hope people think of me as someone who stands for open standards, open directory, and consumer-friendly technology.

After, I founded Lindows, a desktop Linux company now called Linspire. I stepped down as CEO earlier this year to focus on SIPphone and MP3tunes, which is a digital music company just getting started.

What attracted you to SIPphone?

Well, I look at voice as being one of the last untapped frontiers available on the Internet. If you look at email, search and instant messaging, those are largely claimed territories. I think voice is an unclaimed territory. That's what compelled me to start SIPphone.

What do you make of today's announcement that eBay will purchase Skype?

eBay buying Skype is great to raise the awareness for VoIP, but both companies have closed directories and APIs that only let people access data on the periphery. Both of their "open-platform" initiatives are highly restrictive and do not promote the best experience for the user. The world does not want a closed commerce and communication experience owned by eBay. The world wants a product that can call anyone, anywhere without paying a toll to a gatekeeper. Only time will tell if eBay chooses to truly open Skype up to the world, but I'm not holding my breath.

Tell us more about SIPphone. It's a Voice Over IP startup. How fast are you growing, and how do you differ from Vonage, 8x8 and other VoIP offerings?

There are two camps in the VoIP world. There are the traditionalists, and there are the disrupters. The traditionalists are fairly classical in their approach. They sell you traditional-looking, -feeling phone service with the difference that the call goes over your own broadband connection and setup. In that camp you'd put Vonage and Packet8 and AT&T's CallVantage. So maybe you'd save a few dollars over your traditional phone service, but for the most part it operates the same way.

The disrupters are companies like Skype and SIPphone, and we believe that all calls are going to zero. Meaning that, if I had a service where you'd have to pay for every email that you sent, and I was going to charge you more if the person you were sending an email to lived farther away, you'd think I was crazy. But that's how we still think of phone calls today, even though they're just bits of data traveling along the pipes.

At SIPphone we believe all calls are eventually going to zero, so we have to think of different ways to make money around the voice experience vs. just charging per minute and charging more if they fall outside arbitrary city, state, country lines. When calls become free, it doesn't mean you won't pay for the broadband connection or the wireless connection, but the actual phone call itself is going to be free. New companies in VoIP will have to think about new ways of making money beyond charging per call.

Who are the major players in this space?

If I had to handicap, I'd say Skype No. 1. They can claim 50 million users, but the real number to look at is how many users are online at any given time. That's between 2 and 3 million, so they have a good lead, but remember there are 600 million people on the Internet. I'd probably put Yahoo in the No. 2 spot — they just rolled out voice with their instant messaging network. I think you have to watch Microsoft because of their distribution. You have Google, whose first product is very rudimentary and basic but they'll get better and better. AOL may eventually get it right. And you have Gizmo Project, of course.

In the next six months, you'll have some big leaders emerge. So if the smaller players aren't out there getting some attention and building that installed base, I think it'll be too late to catch up.

What does a new SIPphone customer need to make this work?

One thing is to buy a piece of hardware that gives them a phone-like experience. If you go to, you'll see a big list of devices, and they can plug in a regular phone and make calls to anyone on the SIPphone network for free. If they want to call a traditional phone, they would buy a calling card experience. Then there's software they can download over the Internet and make calls PC to PC totally free, and if they want to call traditional phones, they can buy calling card minutes.

At SIPphone we believe strongly in an interconnected directory. We want to connect our phone directory with anyone and everyone else in the world. Last week Google announced their Google Talk PC to PC calling software, and they announced they would connect their network with our network, so anyone who's on Gizmo Project can talk with anyone on Google Talk, and vice versa.

We have another program called GUPS, for Global Universal Phone System. We've linked our phone directory with universities around the world: UCLA, UC San Diego, BYU, MIT. Today, on Gizmo Project, you can call any phone on the university campus network for free. That's our goal, to interconnect and slowly erode that PSTN [public switched telephone network] and pull more free telephone numbers into the VoIP world.

How many customers does SIPphone have?

We have about 100,000 people who've purchased a hard phone — a router or an adapter, that lets you plug in a regular phone. We have about 150,000 soft phone users — people who've downloaded software that connects to our network.

Project Gizmo

Is the Gizmo Project open source?

It was written by developers at SIPphone, so it's not open source. We licensed some technology called GIPS that allows you to deal with poor network-quality environments. It's the same technology Skype uses.

Skype just turned 2 this week, and it has 50 million customers, so why would someone choose Gizmo Project instead?

Two big reasons. One, Gizmo Project is open standards based. We'll interact with everyone in the world. With Skype, you can connect only with others who use Skype, and that's not a world that I want. I don't want the world to go like IM, where MSN and Yahoo have their little protected silo of users. I want a world where anyone can call anyone else. Because they're proprietary, there's no WiFi Skype phone today. There's a WiFi SIPphone. So you're beginning to see the world get behind SIP, much as they did with MP3. There wasn't just one company innovating. So people have to decide, do you want a closed world or one that interoperates with everyone else?

Now, there are some feature differences between Skype and Gizmo Project. On the plus side for Skype, they have instant messaging, and we won't be putting out our beta version for another week or two. What we do have is call record, which is great for podcasting. We have unlimited conference calling, free voice mail, call mapping, sound blasts, which is the ability to send sound. So there are a lot of features where we trump Skype. But there's something fundamental behind the scenes that makes us different.

One of the things Skype does, which I think is pretty sneaky, is to use your computer to route other people's telephone calls. They could be using your processor and your bandwidth as a supernode to move other people's calls, and that's something that we don't do. We have servers and data centers around the world that relay calls, and we never use your computer for anything other than your own call. So I think this has made Skype fairly unpopular with enterprises that guard their network resources, and some users have noticed real slowdowns because their computer moves calls for other people. You can't turn it off, either.

Interesting. I have Skype and I wonder if that's why I sometimes have a sluggish connection.

If you open up your Windows Task Manager, you'll see the CPU usage really zoom up, and that could be because you're routing other people's calls. It happens most often if you have a static IP address.

How does your network's voice quality compare with Skype?

We're both using the GIPS [Global IP Sound] technology — Google licenses it too — and it's a very sophisticated technology that handles network jitter and sees how much bandwidth you're using and adjusts it if you're on dial-up, for example. So the call quality is going to be comparable between the two.

The one advantage we may have is that you don't have a third party you're relying on to route your calls, like Skype does. We've deployed routers and data centers around the world: Hong Kong, France, San Jose. So we might have a small call quality advantage because we operate these supernodes. But honestly, the call quality will be pretty much the same.

Can Gizmo be used with cell phones?

You can of course call cell phones, if you buy call-out minutes. We sell call-in and call-out. Call-in is when you get a real telephone number that is assigned to your Gizmo Project, so wherever you are on the Internet, someone can call and your computer will ring. We support U.S. and U.K. telephone numbers today.

Call-out is the ability to call regular phones like land lines or mobiles. We don't offer a PDA version today. We're investigating it, but we haven't started development yet.

On the interoperability question, on Gizmo today you can't call someone who has Skype. Is that primarily a technology issue or a business issue?

It's a business issue in that Skype has refused to connect their network with anybody else's. If you go to, you'll see a giant list of VoIP companies that we connect with today. Skype doesn't connect with anyone, and because they're big they think they can get away with it.

There's no technical reason why you couldn't build a gateway, even though Skype doesn't use SIP, they use their own proprietary thing. You'd have to build a translator or gateway, but there's no reason why it couldn't be done.

Tell me more about your business plans. How do you make money?

I think long-term you'll see sort of a Google model emerge around VoIP. Today on Google, ads come up and you click on the ones that look interesting and they charge the advertiser a few pennies. Think about that same model for voice calls. Say, on Gizmo Project, you'll say, I'm hungry for some pizza. I type in pizza and I'm immediately presented with, do you want to order from Domino's, from Shakey's, from Papa John's? If I click Domino's and order from them, perhaps they'll pay 35 or 50 cents or a dollar. We're not doing this today at Gizmo, but I think this is one of the ways the revenue model around voice will emerge going forward.

OK, I won't make any pie-in-the-sky jokes. Who do you see as your typical customer?

Who talks on the phone? Ultimately, Voice Over IP touches the entire world, very similar to music. It's not just 13-year-old girls or 24-year-old men who like videogames. Everyone uses their telephone. Now, clearly there are early adopters who like to embrace new technology before it moves to the wider demographic. But I will tell you a story.

I went to one of the large carriers and I told them, you won't believe the explosion of PC to PC calling we're seeing at SIPphone. He said, You know, geeks might talk to each other on PCs, but I don't see normal people using their PCs to make calls. And I was struck by a very similar conversation I had about seven years ago with the record labels when I said, You won't believe what's happening, people are listening to music on their computers. And I remember them saying, well, we don't think the average person is going to listen to music on a computer. And you tried to explain to them, well, look, you can do so much more with a computer — you can organize your music and make playlists and burn CDs. They couldn't get it.

I think we're in the same era today where when you talk to the traditional phone guys about PC to PC calling, they're really skeptical. They just don't understand the power you have when the call is going through your computer, to record it and have a five-way conference call and have sound effects or send images along — this is really what's going to get everyone excited about PC calling.

Tell me more about those features — Map It, record any call. What are some of those about?

Well, the feature that is most used by consumers is the call record feature. You hit a button and it will say, Now recording the call, and it saves it to a WAV file. This is what a lot of people are doing to create quick and dirty podcasts.

With the free conference call service, you can have multiple hosts call in and record a podcast. There's a great eye candy piece called Map It, and it will show in Google Maps a location of who you're talking to. So if you're not sure who you're talking to or if they are who they say they are, you can click that and find out.

I think the most fun feature is the Sound Blast. These are sort of like smilies are to instant messaging. We ship with six or seven sounds, like cars crashing. But you can add any WAV file you'd like to your Gizmo Project. It might be your favorite band or DJ or movie sound effects and play them in the middle of calls, and people go, how the heck are you doing that?

You can do your own morning zoo radio show.

Exactly right. You can add the gong sound or gun shots or anything you'd like. That's another thing you're starting to see on podcasts. Record your podcast, add six or seven sound effects, hit stop recording and you've made a show.

What final steps do you need to finish your podcast?

We save the call in a WAV format, and you typically want to convert that to mp3 and upload it to wherever you host it.

Why not just save the call to MP3?

We haven't licensed MP3 for Gizmo Project yet.

You can also send files and images with Gizmo?

This is probably the No. 1 feature that people have asked for. Our instant messaging is based on Jabber, an open protocol that allows us to interoperate with places like Google Talk. So you'll see us implement Jabber instant messaging and file transfer in upcoming versions.

Can anyone in a WiFi cloud make free Internet voice calls to other SIPphone users?

Absolutely. And voice mail is another feature a lot of users like. Skype charges for it, but it's free on Gizmo Project. If you're running Gizmo and you missed a call, it actually comes to your email as an audio attachment.

Just to make it clear, you can download the Gizmo Project application and use it free without needing an accompanying SIPphone or other hardware.

Right. You just download Gizmo from and there's a quick sign-up process, without a need for credit card or billing information, and you're making free calls. If you want the highest quality calls, you can buy a 10-dollar headset that will plug right into the mike jack on your desktop or your laptop and then you have a really high-quality call.

My favorite device is an mVox. This is a small device, about the size of a deck of cards, that plugs into your USB port. It is a speaker phone of extraordinarily high quality, and this removes the need to plug in a headset, but the person you're talking to won't know you're on a speaker phone because of sophisticated echo cancellation and things like that.

What are your personal calling habits? Have you stopped using traditional phones?

When I'm at my desk, I use Gizmo Project. I still use my mobile phone when I'm driving in the car. But I almost never use my office line or home line anymore. We're working on ways that will combine even your mobile experience with Gizmo Project. We can't really talk about that, but I think you'll see your mobile phone and your PC phone merge over time.

What kinds of features or functionalities do you see emerging in the next few years?

Right now, phone numbers are tied to locations. You have your home phone, which rings your house. You have your office phone, which rings your office. You have your cell phone. I think in the future you'll have one number that rings your person. So JD's going to have a telephone number and JD's gonna be able to decide, do I want this to ring my PC and my mobile, or my PC and my mobile and my office, or move the call around.

Consumers will have one number and they'll be able to take that call on their PC, on their mobile phone, on their home phone, or move it around. So if I'm talking to you on my computer and I decide I need to get in my car to commute, I can transfer that call to my mobile phone. That's what's missing today from the calling world. Voice Over IP can let you decide, is this person important and do I want to take this call on my PC or my mobile or my land line or wherever.

Even further out, how do you see the landscape for voice communication and telephony changing? Is this going to be ubiquitous and everywhere, spelling trouble for the traditional phone companies?

Yes, the traditional phone companies will clearly be in trouble. It's no secret that they're losing 20 to 30 percent of their revenues every year. Vonage, the darling of the VoIP industry whose business does not make sense in any way, they had to lower their rates 30 percent last year.

In five to 10 years from now, there's no question that all calls will be free. There won't be this notion that I'm paying per minute or I'm paying more to call this guy than that guy. Calls will be included with your wireless service or broadband pipe. Businesses like Vonage that are charging people $35 per month for their calling, those businesses are dead.

What will that do to society if we're going to have an always-on, always-connected society where anybody can call anyone else at any time? Will that explode the amount of communication taking place?

Absolutely. Already we're seeing today where people will open a voice call and leave it on as sort of a virtual office. They'll leave an audio channel open between two friends or colleagues and they'll pick up conversations whenever they'd like. 'Hey, Bob!' and Bob will go, 'What?' So the notion of picking up a phone call or hanging it up kind of goes away, and you just decide where you want to connect it to. We're already seeing people make calls that last 7 1/2 hours. They're not talking for 7 1/2 hours, they've likely got two speaker phones and they leave the connection open for when they want to talk. It'll definitely bring people closer by having virtual voice connections that just stay on 24 hours a day.

Anything else?

This field is so disruptive, because it's totally changing how people think about voice, and it's going to give people so much power that they don't have today. And it's really exciting and fun to be in this space. It feels very much like the early days of

J.D. Lasica's new book about the digital media revolution is Darknet: Hollywood's War Against the Digital Generation (Wiley & Sons).