Tuesday, August 21, 2007

Massively multicore processor runs Linux Aug. 20, 2007

A startup founded by an MIT professor claims to have "solved the fundamental challenges associated with multicore scalability." Tilera's first products include a 64-core Tile64 SoC (system-on-chip), a PCI Express add-in board for networking and video-processing applications, multicore-optimized Linux libraries, and an Eclipse-based multicore development toolset.

Fabbed on 90nm process technology at TSMC, the Tile64 chip is "the first in a family of chips that can scale to hundreds and even thousands of cores," Tilera said. The company plans to bake a 120-core version on 65nm technology in the future, it added.

The Tile64 is based on a proprietary VLIW (very long instruction word) architecture, on which a MIPS-like RISC architecture is implemented in microcode. A hypervisor enables each core to run its own instance of Linux -- or other OSes, once they become available. Alternatively, the whole chip can run Tilera's 64-way SMP (symmetric multiprocessing) implementation.

Tilera Tile64 architecture

The crown jewel of the Tile64 architecture is a network-like "iMesh" switching interconnect said to eliminate the centralized bus intersection that in previous multicore designs has limited scalability, according to the company. Tilera's founder, MIT professor and serial entrepreneur Dr. Anant Agarwal, has experimented with mesh-like chip interconnects since 1996, the company said.
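To make the scaling argument concrete, here is a minimal Python sketch that counts links and hops in a two-dimensional mesh. It assumes the 64 tiles form an 8 x 8 grid and that traffic uses dimension-ordered routing; neither detail is stated in Tilera's announcement, and the function names are illustrative rather than part of any Tilera software.

```python
from itertools import product


def mesh_hops(src, dst):
    """Hops between two tiles, assuming dimension-ordered routing.

    With that assumption the hop count is the Manhattan distance
    between the (x, y) grid coordinates of the two tiles.
    """
    (sx, sy), (dx, dy) = src, dst
    return abs(sx - dx) + abs(sy - dy)


def mesh_links(width, height):
    """Number of neighbor-to-neighbor links in a width x height mesh."""
    return width * (height - 1) + height * (width - 1)


def average_hops(width, height):
    """Mean hop count over all ordered pairs of distinct tiles."""
    tiles = list(product(range(width), range(height)))
    pairs = [(a, b) for a in tiles for b in tiles if a != b]
    return sum(mesh_hops(a, b) for a, b in pairs) / len(pairs)


# Assumed layout: 64 tiles arranged as an 8 x 8 grid.
print(mesh_links(8, 8))             # 112
print(mesh_hops((0, 0), (7, 7)))    # 14
```

Under these assumptions an 8 x 8 mesh has 112 neighbor links, and the two farthest tiles are 14 hops apart. The number of links grows with the tile count, whereas a single shared bus remains one resource however many cores contend for it.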

Each of the Tile64's cores clocks between 600MHz and 900MHz; each has its own L1 and L2 cache. L3 cache is handled in an interesting way, as Bob Dowd, director of marketing, explains. "We take all the L2 caches and consider them in aggregate to be the L3 cache," he said. "It's highly effective, because if you reference your own cache, and don't find the data you're looking for, a neighbor may have it, and that's faster than going off chip to external memory by a good ways."
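The lookup order Dowd describes (own cache, then a neighbor's cache, then external memory) can be sketched as a toy model in Python. Everything here is hypothetical: the dictionaries standing in for caches, the `lookup` function, and the linear scan over other tiles are simplifications for illustration and say nothing about how the hardware actually locates a cache line.

```python
def lookup(address, tile_id, l2_caches, external_memory):
    """Return (value, source) for a read issued by tile `tile_id`.

    Search order follows the description in the article:
    1. the tile's own L2 cache,
    2. the L2 caches of the other tiles (the aggregate "L3"),
    3. off-chip external memory.
    """
    own_cache = l2_caches[tile_id]
    if address in own_cache:
        return own_cache[address], "local L2"

    # Simplification: scan every other tile's cache in turn.
    for other_id, cache in l2_caches.items():
        if other_id != tile_id and address in cache:
            return cache[address], f"remote L2 (tile {other_id})"

    # Miss everywhere on chip: fetch from external memory and keep a
    # local copy (the fill policy is an assumption of this sketch).
    value = external_memory[address]
    own_cache[address] = value
    return value, "external memory"


# Two tiles with one cached line each, plus backing memory.
caches = {0: {0x10: "a"}, 1: {0x20: "b"}}
memory = {0x10: "a", 0x20: "b", 0x30: "c"}
print(lookup(0x20, 0, caches, memory))  # ('b', 'remote L2 (tile 1)')
```

A read that misses locally but hits in another tile's cache never touches external memory, which is the saving Dowd is pointing to.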

The Tile64 is implemented as a system-on-chip (SoC) with no requirement for external northbridge and/or southbridge chips. This saves power, at the expense of locking in a specific peripheral mix, essentially tying the chip to specific verticals, according to the company. Dowd noted, "We did a lot of research, and believe we have the peripheral mix right for the markets we are targeting -- networking and video. If we went into storage with a new processor, we'd add fiber and disk drive interfaces."

Tilera claims that the Tile64 outperforms Intel's dual-core Xeon processor by a factor of 10, while offering 30 times better performance per watt. Compared to TI's top-of-range TMS320DM648 DSP, performance per watt is claimed to be 40 times better.
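Taken together, the two Xeon figures imply something about relative power draw, provided both ratios were measured on the same workload (the announcement does not say). A short Python sketch of the arithmetic, using a hypothetical helper name:

```python
from fractions import Fraction


def implied_power_ratio(perf_ratio, perf_per_watt_ratio):
    """Power ratio implied by two claimed ratios.

    perf_per_watt_ratio = perf_ratio / power_ratio, so
    power_ratio = perf_ratio / perf_per_watt_ratio.
    """
    return Fraction(perf_ratio, perf_per_watt_ratio)


# Claimed figures versus the dual-core Xeon: 10x performance,
# 30x performance per watt.
print(implied_power_ratio(10, 30))  # 1/3
```

In other words, the claims would put the Tile64 at roughly one third of the Xeon's power on that workload. This is an inference from the quoted ratios, not a figure Tilera published.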

Another touted benefit is the ability to consolidate control- and data-plane functions on a single device, with "solid-wall" processor boundaries reinforcing security and licensing containment barriers. In this regard, the Tile64 resembles another heavily multicore chip, Cavium's 16-way MIPS64-based Octeon.

Software environment and tools

Tilera claims that existing, "unmodified" Linux apps will build for the Tile64 processor using the company's toolchain. The toolchain includes a compiler licensed from SGI and based on MIPSpro.

Alternatively, developers can port their applications to Tilera's iLib C library, aimed at exploiting parallelism while still supporting standard system calls. The approach appears to resemble that used in Intel's Threading Building Blocks, recently released under an open source license.

Finally, for users wishing to manually tune multicore application performance, Tilera will offer a full "MDE" (multicore development environment) tool suite based on Eclipse. In addition to a full IDE (integrated development environment), MDE includes a parallel debugger, along with an application profiler aimed at helping developers figure out which parts of their code to optimize for multicore.

Tilera is in discussions with "all major Linux support providers," Dowd said, adding, "We'll have ecosystem announcements coming out as we line them up."

Early markets, customers, reference implementations

The first Tile64 chips target network and video devices that require significant application processing, such as surveillance systems, and firewalls with deep packet inspection. Early customers in the networking space reportedly include 3Com and firewall vendor TopLayer, while early video customers reportedly include U.K.-based high-definition videoconferencing equipment provider Codian, and BackupTV, a vendor of network-based video recording and other head-end services for cable TV network providers.

To hasten adoption, Tilera is offering processor daughterboards implemented as PCI Express cards with six or 12 gigabit Ethernet ports. The cards can be used in production systems with passive backplanes, or as targets in development hosts, Dowd said. He declined to specify pricing.

Tilera TILExpress-64 PCIe card and its architecture

CEO Devesh Garg stated, "This is the first significant new development in chip architecture in a decade. We developed this new architecture because existing multicore technologies simply cannot scale. Moreover, customers have repeatedly indicated that the current multicore software tools are very primitive because they are based on single-processor-core models. We're introducing a revolutionary hardware and software platform that has solved the fundamental challenges associated with multicore scalability."


The Tile64 is available now, in three variants differentiated by I/O mix and clock speed. Pricing starts at $435 in quantities of 10,000, the company said. Tilera's iLib and MDE tools and the TILExpress-64 board are also available, at undisclosed pricing.

--Henry Kingman