Solution 1

You don't need to do a bunch of image mangling yourself to use a Core ML model with an image — the new Vision framework can do that for you.

import Vision
import CoreML

let model = try VNCoreMLModel(for: MyCoreMLGeneratedModelClass().model)
let request = VNCoreMLRequest(model: model, completionHandler: myResultsMethod)
let handler = VNImageRequestHandler(url: myImageURL)
try handler.perform([request]) // runs the request; results arrive in myResultsMethod

func myResultsMethod(request: VNRequest, error: Error?) {
    guard let results = request.results as? [VNClassificationObservation]
        else { fatalError("huh") }
    for classification in results {
        print(classification.identifier, // the scene label
              classification.confidence) // how sure the model is, from 0 to 1
    }
}


The WWDC17 session on Vision should have a bit more info — it's tomorrow afternoon.

Solution 2

You can use pure Core ML, but you have to resize the image to 224 x 224 yourself:

DispatchQueue.global(qos: .userInitiated).async {
    // Resnet50 expects an image 224 x 224, so we should resize and crop the source image
    let inputImageSize: CGFloat = 224.0
    let minLen = min(image.size.width, image.size.height)
    let resizedImage = image.resize(to: CGSize(width: inputImageSize * image.size.width / minLen, height: inputImageSize * image.size.height / minLen))
    let cropedToSquareImage = resizedImage.cropToSquare()

    guard let pixelBuffer = cropedToSquareImage?.pixelBuffer() else {
        return // could not create a pixel buffer from the image
    }
    guard let classifierOutput = try? self.classifier.prediction(image: pixelBuffer) else {
        return // the prediction failed
    }

    DispatchQueue.main.async {
        self.title = classifierOutput.classLabel
    }
}

// ...

extension UIImage {

    func resize(to newSize: CGSize) -> UIImage {
        UIGraphicsBeginImageContextWithOptions(CGSize(width: newSize.width, height: newSize.height), true, 1.0)
        self.draw(in: CGRect(x: 0, y: 0, width: newSize.width, height: newSize.height))
        let resizedImage = UIGraphicsGetImageFromCurrentImageContext()!
        UIGraphicsEndImageContext() // balance the Begin call above

        return resizedImage
    }

    func cropToSquare() -> UIImage? {
        guard let cgImage = self.cgImage else {
            return nil
        }
        var imageHeight = self.size.height
        var imageWidth = self.size.width

        // Use the shorter side for both dimensions
        if imageHeight > imageWidth {
            imageHeight = imageWidth
        }
        else {
            imageWidth = imageHeight
        }

        let size = CGSize(width: imageWidth, height: imageHeight)

        // Center the square inside the source image
        let x = ((CGFloat(cgImage.width) - size.width) / 2).rounded()
        let y = ((CGFloat(cgImage.height) - size.height) / 2).rounded()

        let cropRect = CGRect(x: x, y: y, width: size.width, height: size.height)
        if let croppedCgImage = cgImage.cropping(to: cropRect) {
            return UIImage(cgImage: croppedCgImage, scale: 0, orientation: self.imageOrientation)
        }

        return nil
    }

    func pixelBuffer() -> CVPixelBuffer? {
        let width = self.size.width
        let height = self.size.height
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         Int(width),
                                         Int(height),
                                         kCVPixelFormatType_32ARGB, // matches the noneSkipFirst bitmap layout used below
                                         attrs,
                                         &pixelBuffer)

        guard let resultPixelBuffer = pixelBuffer, status == kCVReturnSuccess else {
            return nil
        }

        CVPixelBufferLockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(resultPixelBuffer)

        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        guard let context = CGContext(data: pixelData,
                                      width: Int(width),
                                      height: Int(height),
                                      bitsPerComponent: 8,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(resultPixelBuffer),
                                      space: rgbColorSpace,
                                      bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else {
                                        CVPixelBufferUnlockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
                                        return nil
        }

        // Flip the coordinate system so the image is not drawn upside down
        context.translateBy(x: 0, y: height)
        context.scaleBy(x: 1.0, y: -1.0)

        UIGraphicsPushContext(context) // make the bitmap context current so draw(in:) renders into it
        self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
        UIGraphicsPopContext()
        CVPixelBufferUnlockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

        return resultPixelBuffer
    }
}
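
The resize-and-crop arithmetic used above can be checked on its own, without UIKit. The following is only a sketch: the helper names `aspectFillSize(for:target:)` and `centeredSquare(in:)` are hypothetical and not part of the answer's code, and the 640 x 480 input is just an example. It scales the shorter side to the target length, then centers a square on the result.

```swift
import Foundation

/// Size after scaling so that the shorter side equals `target` (aspect ratio preserved).
/// Hypothetical helper mirroring the expression passed to resize(to:).
func aspectFillSize(for size: CGSize, target: CGFloat) -> CGSize {
    let minLen = min(size.width, size.height)
    return CGSize(width: target * size.width / minLen,
                  height: target * size.height / minLen)
}

/// Centered square inside `size`, with the origin rounded to whole units.
/// Hypothetical helper mirroring the math in cropToSquare().
func centeredSquare(in size: CGSize) -> CGRect {
    let side = min(size.width, size.height)
    let x = ((size.width - side) / 2).rounded()
    let y = ((size.height - side) / 2).rounded()
    return CGRect(x: x, y: y, width: side, height: side)
}

// A 640 x 480 source scaled for a 224 x 224 model input.
let scaled = aspectFillSize(for: CGSize(width: 640, height: 480), target: 224)
let crop = centeredSquare(in: scaled)
```

For this input the scaled image is roughly 298.67 x 224, and the crop keeps a 224 x 224 square starting 37 units from the left edge.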

You can find the expected input image size in the .mlmodel file.
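
The same information can also be read at run time from the model description. This is a minimal sketch, assuming you already have an `MLModel` instance; the function name `expectedImageSize(of:inputName:)` is hypothetical, and the input name depends on your model.

```swift
import CoreML

/// Returns the pixel size the model expects for the named image input,
/// or nil if there is no such input or it is not an image.
/// Hypothetical helper, not part of the Core ML API.
func expectedImageSize(of model: MLModel, inputName: String) -> (width: Int, height: Int)? {
    guard let constraint = model.modelDescription
        .inputDescriptionsByName[inputName]?
        .imageConstraint else { return nil }
    return (constraint.pixelsWide, constraint.pixelsHigh)
}
```

With the classifier above, whose prediction method takes an `image:` argument, a call might look like `expectedImageSize(of: classifier.model, inputName: "image")`, assuming `classifier` is the generated model class instance.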

A demo project that uses both the pure Core ML and the Vision variants can be found here:

Solution 3

If the input is a UIImage rather than a URL and you want to use VNImageRequestHandler, you can convert it to a CIImage first.

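func updateClassifications(for image: UIImage) {

    // Note: this initializer is not built into the SDK; Apple's sample project defines it in an extension
    let orientation = CGImagePropertyOrientation(image.imageOrientation)

    guard let ciImage = CIImage(image: image) else { return }

    let handler = VNImageRequestHandler(ciImage: ciImage, orientation: orientation)
    // ... perform your VNCoreMLRequest with this handler
}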

From Classifying Images with Vision and Core ML
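
The call `CGImagePropertyOrientation(image.imageOrientation)` relies on an initializer that the SDK does not provide; Apple's sample project adds it in an extension. A sketch of such an extension is below. It uses the current type name `UIImage.Orientation` (called `UIImageOrientation` in 2017), and the fallback in `@unknown default` is my assumption, not something taken from the sample.

```swift
import UIKit
import ImageIO

extension CGImagePropertyOrientation {
    /// Maps a UIKit image orientation to the matching ImageIO orientation.
    init(_ uiOrientation: UIImage.Orientation) {
        switch uiOrientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default: self = .up // assumption: treat future cases as upright
        }
    }
}
```

The explicit mapping is needed because the two enums use different raw values for the same orientation, so converting through `rawValue` would give wrong results.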


    I am trying to get Apple's sample Core ML Models that were demoed at the 2017 WWDC to function correctly. I am using the GoogLeNet to try and classify images (see the Apple Machine Learning Page). The model takes a CVPixelBuffer as an input. I have an image called imageSample.jpg that I'm using for this demo. My code is below:

            var sample = UIImage(named: "imageSample")?.cgImage
            let bufferThree = getCVPixelBuffer(sample!)
            let model = GoogLeNetPlaces()
            guard let output = try? model.prediction(input: GoogLeNetPlacesInput.init(sceneImage: bufferThree!)) else {
                fatalError("Unexpected runtime error.")
            }

    I am always getting the unexpected runtime error in the output rather than an image classification. My code to convert the image is below:

    func getCVPixelBuffer(_ image: CGImage) -> CVPixelBuffer? {
            let imageWidth = Int(image.width)
            let imageHeight = Int(image.height)
            let attributes : [NSObject:AnyObject] = [
                kCVPixelBufferCGImageCompatibilityKey : true as AnyObject,
                kCVPixelBufferCGBitmapContextCompatibilityKey : true as AnyObject
            ]
            var pxbuffer: CVPixelBuffer? = nil
            CVPixelBufferCreate(kCFAllocatorDefault,
                                imageWidth,
                                imageHeight,
                                kCVPixelFormatType_32ARGB,
                                attributes as CFDictionary?,
                                &pxbuffer)
            if let _pxbuffer = pxbuffer {
                let flags = CVPixelBufferLockFlags(rawValue: 0)
                CVPixelBufferLockBaseAddress(_pxbuffer, flags)
                let pxdata = CVPixelBufferGetBaseAddress(_pxbuffer)
                let rgbColorSpace = CGColorSpaceCreateDeviceRGB();
                let context = CGContext(data: pxdata,
                                        width: imageWidth,
                                        height: imageHeight,
                                        bitsPerComponent: 8,
                                        bytesPerRow: CVPixelBufferGetBytesPerRow(_pxbuffer),
                                        space: rgbColorSpace,
                                        bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue)
                if let _context = context {
                    _context.draw(image, in: CGRect.init(x: 0, y: 0, width: imageWidth, height: imageHeight))
                }
                else {
                    CVPixelBufferUnlockBaseAddress(_pxbuffer, flags);
                    return nil
                }
                CVPixelBufferUnlockBaseAddress(_pxbuffer, flags);
                return _pxbuffer;
            }
            return nil
    }

    I got this code from a previous StackOverflow post (last answer here). I recognize that the code may not be correct, but I have no idea of how to do this myself. I believe that this is the section that contains the error. The model calls for the following type of input: Image<RGB,224,224>

