Metal Part 3 - Fixing a Memory Usage Spike Bug

Swift Dev Log

2료일 2025. 4. 20. 21:31

Previous post: Metal(2) - Up to Writing the Shader Code
https://codeisfuture.tistory.com/119

The previous post covered Metal as a low-level API that gives access to the GPU for fast graphics processing, in other words the layer underneath things like SpriteKit and animations.

The original problem

    func applyFilter(_ image: UIImage, filtertype: String, intensity: Float) async -> UIImage {
        let startTime = CACurrentMediaTime()

        // Every call rebuilds the input texture from the UIImage, which is expensive.
        guard let cgImage = image.resize(targetSize: CGSize(width: 1048, height: 1048)).cgImage else { return image }
        let textureLoader = MTKTextureLoader(device: device)
        guard let inTexture = try? await textureLoader.newTexture(cgImage: cgImage, options: nil) else { return image }

        // The output texture mirrors the input's pixel format and size.
        let descriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: inTexture.pixelFormat, width: inTexture.width, height: inTexture.height, mipmapped: false)
        descriptor.usage = [.shaderRead, .shaderWrite]
        guard let outTexture = device.makeTexture(descriptor: descriptor) else { return image }
        
        let pipelineState: MTLComputePipelineState
        do {
            pipelineState = try await getPipelineState(for: filtertype)
        } catch {
            print("🔴 Failed to create pipeline: \(error.localizedDescription)")
            return image
        }
        guard let commandBuffer = commandQueue.makeCommandBuffer(),
              let encoder = commandBuffer.makeComputeCommandEncoder() else {
            return image
        }
        encoder.setComputePipelineState(pipelineState)
        encoder.setTexture(inTexture, index: 0)
        encoder.setTexture(outTexture, index: 1)
        var intensityValue = intensity
        encoder.setBytes(&intensityValue, length: MemoryLayout<Float>.size, index: 0)
        
        let threadGroupSize = MTLSize(width: 16, height: 16, depth: 1)
        let threadGroupCount = MTLSize(
            width: (inTexture.width + threadGroupSize.width - 1) / threadGroupSize.width,
            height: (inTexture.height + threadGroupSize.height - 1) / threadGroupSize.height,
            depth: 1
        )
        encoder.dispatchThreadgroups(threadGroupCount, threadsPerThreadgroup: threadGroupSize)
        encoder.endEncoding()
        
        commandBuffer.commit()
        commandBuffer.waitUntilCompleted()
        
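        // Avoid force-unwrapping: CIImage(mtlTexture:) is failable.
        guard let ciImage = CIImage(mtlTexture: outTexture) else { return image }
        // Note: a new CIContext is also created on every call.
        let context = CIContext()
        guard let cgImageOut = context.createCGImage(ciImage, from: ciImage.extent) else {
            return image
        }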
        return UIImage(cgImage: cgImageOut, scale: image.scale, orientation: .up)
    }

applyFilter takes the filter name and an intensity and produces a new UIImage. The catch is that the intensity can be adjusted with a slider. Here is the relevant part of my TCA reducer:

    case .binding(\.sliderValue):
        guard let filter = state.selectedFilter else { return .none }
        return .run { [originalImage = state.originalImage, slider = state.sliderValue] send in
            let filterimage = try await metalFilterClient.applyFilter(originalImage, filter.metalFunction, Float(slider))
            await send(.changeFilterImage(filterimage))
        }
        .throttle(id: "sliderUpdate", for: 0.1, scheduler: DispatchQueue.main, latest: true)

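With this setup, applyFilter ran and a new UIImage instance was created every time the slider value changed. In the worst case, dragging from 0 to 1 in steps of 0.01 creates 100 image objects in memory. Each image object is large, so after only a few drags the OS reclaimed the memory by force-quitting the app, and it crashed....

-> Every time the slider moves, the original image is converted into a Metal texture

-> the filter pipeline is set up -> the filter is applied -> the result is converted back into a UIImage

This whole sequence was repeated on every change, which is why memory usage spiked.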
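To get a feel for the numbers, here is a rough back-of-the-envelope estimate. It assumes 4 bytes per pixel (BGRA8) at the 1048 x 1048 size used above, three full-size buffers per call (input texture, output texture, output CGImage), and the worst case where nothing is released between calls. Those assumptions are mine, not measured values.

```swift
// Rough estimate of the memory held by repeated applyFilter calls.
// Assumptions (not measured): BGRA8 = 4 bytes per pixel, and each call
// keeps three full-size buffers alive (input texture, output texture,
// output CGImage).
let width = 1048
let height = 1048
let bytesPerPixel = 4
let buffersPerCall = 3

func estimatedBytes(calls: Int) -> Int {
    return width * height * bytesPerPixel * buffersPerCall * calls
}

func mebibytes(_ bytes: Int) -> Double {
    return Double(bytes) / (1024.0 * 1024.0)
}

let perCall = estimatedBytes(calls: 1)      // 13_179_648 bytes
let worstCase = estimatedBytes(calls: 100)  // 1_317_964_800 bytes

print("per call: \(mebibytes(perCall)) MiB")
print("100 calls: \(mebibytes(worstCase)) MiB")
```

Under these assumptions that is roughly 12.6 MiB per call and over 1.2 GiB for 100 calls, which makes it easy to see how a few slider drags could get the app killed.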

Solution: reuse the texture and the pipeline

1. Do the initial setup only when a filter is first selected

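    func setupFilter(_ image: UIImage, filtertype: String) async {
        self.currentImage = image
        self.currentFilter = filtertype
        // Convert the image to a texture once, up front.
        preparedTexture = await makeMTLTexture(from: image)
        do {
            preparedPipelineState = try await getPipelineState(for: filtertype)
        } catch {
            // Clear any stale pipeline so a previous filter is not reused by mistake.
            preparedPipelineState = nil
            print("🔴 Failed to create pipeline: \(error.localizedDescription)")
        }
    }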

2. Reuse the texture and the pipeline when the slider changes

    func updateIntensity(_ intensity: Float) async -> UIImage {
        let startTime = CACurrentMediaTime()
        guard let inTexture = preparedTexture,
              let pipelineState = preparedPipelineState, let currentImage = currentImage else {
            if let img = currentImage, let filter = currentFilter {
                return await applyFilter(img, filtertype: filter, intensity: intensity)
            }
            return UIImage()
        }
        
        guard let outTexture = createOutputTexture(matching: inTexture) else {return currentImage}
        print("outTexture pixelFormat: \(outTexture.pixelFormat.rawValue)") // debug log
        // Apply the filter
        guard let commandBuffer = commandQueue.makeCommandBuffer(),
              let encoder = commandBuffer.makeComputeCommandEncoder() else {
            return currentImage
        }
        
        encoder.setComputePipelineState(pipelineState)
        encoder.setTexture(inTexture, index: 0)
        encoder.setTexture(outTexture, index: 1)
        var intensityValue = intensity
        encoder.setBytes(&intensityValue, length: MemoryLayout<Float>.size, index: 0)
        
        let threadGroupSize = MTLSize(width: 16, height: 16, depth: 1)
        let threadGroupCount = MTLSize(
            width: (inTexture.width + threadGroupSize.width - 1) / threadGroupSize.width,
            height: (inTexture.height + threadGroupSize.height - 1) / threadGroupSize.height,
            depth: 1
        )
        encoder.dispatchThreadgroups(threadGroupCount, threadsPerThreadgroup: threadGroupSize)
        encoder.endEncoding()
        
        commandBuffer.commit()
        commandBuffer.waitUntilCompleted()
        
        guard let filteredImage = convertTextureToUIImage(outTexture, filterImage: currentImage) else {return currentImage}
        let endTime = CACurrentMediaTime()
        print("Filter applied in \(endTime - startTime) seconds")
        return filteredImage
    }

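Resource reuse
The most expensive steps in the Metal pipeline were converting the image into a Metal texture, compiling the shader, and creating the pipeline state. Doing these only once and reusing the results improved performance.
Memory usage optimization
Before: every slider change meant new image -> new texture -> new pipeline = memory usage O(N), where N is the number of slider changes
After: the input texture and the pipeline are reused / only the output texture is created anew = memory usage O(1)
Reduced CPU/GPU synchronization
Storing the required state variables in MetalFilterRenderService completed the optimization!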

Second issue: the whole photo came out brighter

To understand this one, you need to know about sRGB and Linear.

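The sRGB color space

Definition: sRGB is the standard color space widely used for displays and image storage. It is non-linear, with color values adjusted to match how the human eye perceives brightness.
Characteristics: sRGB values have gamma correction applied, so more of the value range is spent on low values (dark areas) and high values (bright areas) are compressed. Example: an sRGB value of 0.5 corresponds to a linear brightness of about 0.218 using the common gamma 2.2 approximation (0.5^2.2); the exact piecewise sRGB curve gives about 0.214.
Typical use: image data held in UIImage and CGImage is typically sRGB-encoded.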
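The conversion itself is easy to check numerically. The sketch below implements the standard piecewise sRGB transfer functions next to the gamma 2.2 shortcut; the function names are my own, chosen for illustration.

```swift
import Foundation

// Exact sRGB transfer functions (piecewise curve), for values in 0...1.
func srgbToLinear(_ s: Double) -> Double {
    return s <= 0.04045 ? s / 12.92 : pow((s + 0.055) / 1.055, 2.4)
}

func linearToSRGB(_ l: Double) -> Double {
    return l <= 0.0031308 ? l * 12.92 : 1.055 * pow(l, 1.0 / 2.4) - 0.055
}

// The gamma 2.2 approximation used in the 0.5^2.2 example.
func gamma22ToLinear(_ s: Double) -> Double {
    return pow(s, 2.2)
}

let exact = srgbToLinear(0.5)        // about 0.214
let approx = gamma22ToLinear(0.5)    // about 0.218
let roundTrip = linearToSRGB(exact)  // back to about 0.5
print(exact, approx, roundTrip)
```

The two curves are close but not identical, which is why both 0.214 and 0.218 show up in discussions of "sRGB 0.5 in linear".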

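The Linear color space

Definition: a Linear color space represents actual light intensity linearly. No gamma correction is applied, which makes it suitable for calculations.
Characteristics: in graphics rendering, lighting calculations, filter operations, and the like need to be done in Linear space to get accurate results.
Role in Metal: shader math is generally meant to run on Linear values. Metal converts automatically only when a texture uses an _srgb pixel format (decoding on read, encoding on write); with any other format the shader sees the raw stored values.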

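What happens when color spaces don't match

If the color space is misinterpreted when Metal reads or writes a texture, color values are converted differently from what was intended, and the image can come out brighter or with distorted colors.
Example: if sRGB values are mistakenly interpreted as Linear values, they are overestimated and the result comes out brighter. Conversely, if Linear values are mistakenly stored as sRGB, the result can come out darker.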
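Both failure modes can be reproduced with plain numbers. This is a minimal sketch using the standard sRGB transfer functions on a single mid-gray value; it models only the encode/decode steps, not a real Metal pipeline, and the variable names are my own.

```swift
import Foundation

func srgbToLinear(_ s: Double) -> Double {
    return s <= 0.04045 ? s / 12.92 : pow((s + 0.055) / 1.055, 2.4)
}

func linearToSRGB(_ l: Double) -> Double {
    return l <= 0.0031308 ? l * 12.92 : 1.055 * pow(l, 1.0 / 2.4) - 0.055
}

let stored = 0.5  // sRGB-encoded value as stored in the image

// Correct: decode once, (do the shader math in linear), encode once.
let correct = linearToSRGB(srgbToLinear(stored))  // about 0.5

// Mistake 1: the encoded value is treated as already linear and then
// encoded for display, so the sRGB curve is applied twice.
let tooBright = linearToSRGB(stored)  // about 0.735

// Mistake 2: a linear value is written out with no encoding step and
// then displayed as if it were sRGB.
let tooDark = srgbToLinear(stored)  // about 0.214 shown instead of 0.5

print(correct, tooBright, tooDark)
```

Mid-gray ends up at about 0.735 in the first mistake and about 0.214 in the second, matching the "brighter" and "darker" outcomes described above.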

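Color space handling in Metal

In Metal, color space handling is decided during texture creation and shader execution. The main pieces are the following.

MTKTextureLoader options

MTKTextureLoader converts a CGImage into an MTLTexture, and its options control how the color data is interpreted.
The .SRGB option:
- true: declares the input data to be sRGB, and the texture's pixelFormat is set to an sRGB format such as .bgra8Unorm_srgb. Values are automatically converted sRGB → Linear when read in a shader.
- false: treats the input data as Linear, and the pixelFormat is set to .bgra8Unorm. The raw values are used without conversion.
- Default: if the option is not specified, the loader goes by the CGImage's own color information, which in practice usually means the data is treated as sRGB.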

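MTLTextureDescriptor's pixelFormat

pixelFormat defines the data format of the texture:
- .bgra8Unorm_srgb: an sRGB texture. Values are converted to Linear on read and Linear → sRGB on write.
- .bgra8Unorm: a Linear texture. No color space conversion takes place.
The pixelFormat of the input and output textures must match for colors to be preserved correctly.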
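One way to keep the two formats in sync is to derive the output descriptor from the input texture instead of hard-coding a format. The helper below is a hypothetical sketch (the name makeOutputDescriptor is mine, and it is not the createOutputTexture used earlier in this post); it only builds the descriptor, so no GPU device is needed to try it.

```swift
import Metal

// Hypothetical helper: build a descriptor for an output texture that
// mirrors the input's pixel format and size, so reads and writes go
// through the same (or no) sRGB conversion.
func makeOutputDescriptor(pixelFormat: MTLPixelFormat, width: Int, height: Int) -> MTLTextureDescriptor {
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: pixelFormat,
        width: width,
        height: height,
        mipmapped: false
    )
    // The compute kernel writes into this texture and Core Image reads it back.
    descriptor.usage = [.shaderRead, .shaderWrite]
    return descriptor
}

// In real code the arguments would come from inTexture.pixelFormat,
// inTexture.width and inTexture.height.
let descriptor = makeOutputDescriptor(pixelFormat: .bgra8Unorm, width: 1048, height: 1048)
print(descriptor.pixelFormat == .bgra8Unorm)
```

The descriptor would then be passed to device.makeTexture(descriptor:). One caveat worth checking against the Metal feature set tables: shader write support for _srgb formats varies by GPU family, so matching an sRGB input format may not be writable from a compute kernel on every device.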

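CIContext color space settings

CIContext is used when converting an MTLTexture into a CGImage.
Options:
- .workingColorSpace: the color space in which operations are performed (e.g. linearSRGB).
- .outputColorSpace: the color space of the output image (e.g. sRGB).
These have to agree with the texture's pixelFormat for the conversion to come out right.

Let's go through my code and see where it went wrong.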

let options: [MTKTextureLoader.Option: Any] = [
    .SRGB: false,
    .generateMipmaps: false
]
guard let inTexture = try? await textureLoader.newTexture(cgImage: cgImage, options: options) else { return image }

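Because .SRGB is set to false, inTexture's pixelFormat becomes .bgra8Unorm. But cgImage is in sRGB format. In this case the sRGB data is mistakenly treated as Linear, so when the shader reads inTexture it gets the sRGB values as raw values with no conversion. The result is that the colors are interpreted incorrectly.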

let context = CIContext(options: [
    .outputColorSpace: CGColorSpace(name: CGColorSpace.sRGB)!,
    .workingColorSpace: CGColorSpace(name: CGColorSpace.linearSRGB)!
])
guard let cgImageOut = context.createCGImage(ciImage, from: ciImage.extent) else { return image }

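Assuming the output texture (outTexture) is set to .bgra8Unorm, Core Image treats it as Linear. Since .outputColorSpace is set to sRGB, it converts the Linear data to sRGB on output. But the input was sRGB data that had been wrongly taken as Linear, so the values were already sRGB-encoded and the sRGB curve ends up applied a second time. The whole color pipeline is distorted, which is why the photo came out brighter.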
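One possible way to make both ends agree, assuming the diagnosis above is right and the texture really holds sRGB-encoded values in a .bgra8Unorm format, is to tell Core Image so explicitly through the .colorSpace image option. This is a sketch of that idea rather than the fix I can confirm for every setup; srgbTaggedOptions and makeCIImage are hypothetical helper names.

```swift
import CoreImage
import Metal

// Options that tag an image as sRGB-encoded, so Core Image decodes the
// values into its working space instead of assuming they are linear.
func srgbTaggedOptions() -> [CIImageOption: Any] {
    guard let srgb = CGColorSpace(name: CGColorSpace.sRGB) else { return [:] }
    return [.colorSpace: srgb]
}

// Assumption: `texture` is the .bgra8Unorm output texture whose bytes
// are sRGB-encoded (the shader passed raw values through).
func makeCIImage(fromSRGBEncoded texture: MTLTexture) -> CIImage? {
    return CIImage(mtlTexture: texture, options: srgbTaggedOptions())
}
```

With the image tagged this way, the CIContext shown above would decode sRGB into the linear working space and encode back to sRGB on output, so the values should round-trip instead of being encoded twice. The alternative is to go fully sRGB-aware with .SRGB: true and an _srgb output format, provided the GPU can write to that format.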

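Caching

All the filters get applied anyway when the thumbnails are first loaded. The pipelines are already created at that point, so I figured caching them would make things faster.

As a result, on the initial applyFilter calls, each one took: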

Filter applied in 1.4334406666675932 seconds

Of course, these run in parallel, so everything finishes within 2 seconds. But what about the calls after that, once the cache is in place?

Filter applied in 0.10101529166422551 seconds

That is a speedup of roughly 14.2x (1.433 s down to 0.101 s).
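The caching idea itself is small. Below is a minimal, generic sketch rather than my actual getPipelineState: KeyedCache is a hypothetical name, and plain strings stand in for the pipeline states so the example runs without a GPU. In the Metal case the key would be the shader function name and the value an MTLComputePipelineState.

```swift
// A minimal keyed cache: the (expensive) builder runs once per key and
// later lookups return the stored value.
// Note: not thread-safe. Parallel thumbnail rendering would need an
// actor or a lock around it.
final class KeyedCache<Key: Hashable, Value> {
    private var storage: [Key: Value] = [:]
    private(set) var buildCount = 0

    func value(for key: Key, build: () throws -> Value) rethrows -> Value {
        if let cached = storage[key] { return cached }
        let value = try build()
        buildCount += 1
        storage[key] = value
        return value
    }
}

// Strings stand in for pipeline states here.
let cache = KeyedCache<String, String>()
let first = cache.value(for: "sepia") { "pipeline(sepia)" }
let second = cache.value(for: "sepia") { "pipeline(sepia) rebuilt" }
print(first, second, cache.buildCount)
```

The second lookup never runs its builder, which is the same effect that took the later filter calls from about 1.43 s down to about 0.10 s.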