nglee/opencv

OpenCV Contributions — Namgoo Lee

Contributed 32 merged upstream pull requests to OpenCV's CUDA stack: 29 in opencv and 3 in opencv_contrib.

Key impact areas

  • Designed and implemented GpuMatND — an N-dimensional GPU matrix type for CUDA. (#19259, interoperability test code in contrib #2805)
  • Eliminated multi-stream race conditions across the CUDA module — data races and hangs in FAST, Canny, StereoBM, and TVL1 optical flow. (#10906, #11483, #13850, #17556)
  • Achieved bit-exact GPU/CPU parity for histogram equalization. (#18136)
  • Improved CUDA toolkit compatibility across multiple releases (CUDA 9–10.1), keeping the build green. (#14000, #13958, #13658, #13596)
  • Added in-place NPP paths for cuda::flip and GpuMat::convertTo, avoiding extra GPU buffers. (#17863, ported to contrib #2612; #17982)
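
To make the "bit-exact parity" item above concrete: the claim is that the GPU output matches the CPU reference byte for byte, not just within a tolerance. The CPU-only C++ sketch below illustrates that kind of check using a simple integer-histogram equalization. It is not the code from #18136. The names `equalize_u8` and `count_mismatches` are hypothetical, and the rounding rule (`std::lround`) is an assumption made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference: equalize 8-bit samples through an integer histogram
// and a 256-entry lookup table, so the result does not depend on the order in
// which pixels are visited.
std::vector<std::uint8_t> equalize_u8(const std::vector<std::uint8_t>& src) {
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t v : src) ++hist[v];

    // Locate the first non-empty bin; it maps to output level 0.
    std::size_t first = 0;
    while (first < 256 && hist[first] == 0) ++first;

    // Empty or constant input: nothing to stretch, return a copy.
    if (first == 256 || hist[first] == src.size()) return src;

    const double scale =
        255.0 / static_cast<double>(src.size() - hist[first]);

    std::array<std::uint8_t, 256> lut{};
    std::size_t sum = 0;  // cumulative count above the first non-empty bin
    for (std::size_t i = first + 1; i < 256; ++i) {
        sum += hist[i];
        // Assumed rounding rule; clamp defensively to the 8-bit range.
        const long r = std::lround(static_cast<double>(sum) * scale);
        lut[i] = static_cast<std::uint8_t>(std::min<long>(r, 255));
    }

    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) dst[i] = lut[src[i]];
    return dst;
}

// Hypothetical parity check: number of positions where two buffers differ.
// A size mismatch is reported as the larger of the two sizes.
std::size_t count_mismatches(const std::vector<std::uint8_t>& a,
                             const std::vector<std::uint8_t>& b) {
    if (a.size() != b.size()) return std::max(a.size(), b.size());
    std::size_t n = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) ++n;
    }
    return n;
}
```

In an actual parity test the second buffer would be the result downloaded from the GPU, and the pass condition would be zero mismatches.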

Merged PRs by theme

GPU data structures & memory

Thread / multi-stream safety

Numerical correctness & algorithm fixes

Build, CI & toolchain compatibility

Refactoring & code quality

Documentation

Code Review & Design Discussion

Contributed via review, design discussion, and code to PRs authored by others — most of which merged:

Full list of PRs I was involved in but did not author (for reference) — opencv, opencv_contrib

[#19534] cudafilters: remove dangerous race condition (merged)
[#19286] add cuda::Stream constructor with cuda stream flags (merged)
[#17671] CUDA: fix native detection on Jetson (merged)
[#17581] CUDA: fix build error on Jetson TX1 and TX2 (merged)
[#17432] CUDA: choose supported CC automatically (merged)
[#16666] [WIP] Add GpuMatND with arbitrary dimension support (closed)
[#13695] Fix cuda::filter corrupted output across threads/streams (merged)
[#12722] cudafilters: fix test failure of Median_Accuracy (merged)
[#12585] cuda: move CUDA modules to opencv_contrib (merged)
[#11951] cmake: allow to use external FindCUDA from modern CMake (merged)
[#11064] cudaarithm: make the asynchronous call to NPP safe (merged)

Issue investigation & design proposals

Diagnosed community bug reports and shipped the merged fix:

  • #18035 (non-deterministic CUDA equalizeHist) → fixed in #18136
  • #17840 (in-place GpuMat flip artifacts) → fixed in #17863
  • #13092 (GpuMat::convertTo in-place) → fixed in #17982
  • #16013 / #18155 (TVL1 optical flow unsafe in async/multithreaded use) → fixed in #17556

Reported and fixed myself:

  • #8725 (stray cudaMalloc() from Stream::Null()) → fixed in #10751

Design proposals / RFCs I opened:

Full list of issues I reported or was involved in (for reference) — opencv, opencv_contrib

[#24115] RFC cuda::Stream — documentation issue and usage inconsistency
[#18347] cudaarithm: inplace version of NPP flip fails with odd number ROI
[#18155] cuda_OpticalFlowDual_TVL1 is not thread-safe in python
[#18051] CUDA GoodFeaturesToTrackDetector is not ThreadSafe ?
[#18035] CUDA equalizeHist does not produce identical result
[#17840] In-place flip of GpuMat produces image artifacts
[#16433] GpuMat as input/output to cv::dnn::Net
[#16013] Corrupted optical flow using cuda::DenseOpticalFlow asynchronously in multithreaded environment
[#13092] cv::cuda::GpuMat.convertTo() seems not to support in-place, while cv::Mat does
[#2724] (contrib) Error building with BUILD_CUDA_STUB on machine without CUDA
[#2361] (contrib) Bug in cv::cuda::warpPerspective
[#14052] an illegal memory access was encountered in function 'download'
[#14017] Opencv 4.0.1 with Cuda
[#13996] opencv-4.0.1, CUDA10.1, failed to build cudaimageproc
[#13984] Problem compiling clahe.cu — identifier "PtrStepus" is undefined
[#13952] OpenCV 4.0.1 + Cuda 10.1, failed to build?
[#13897] Failed to build OpenCV 4.0.1 with CUDA 10 10.0
[#13883] Template Matching is not threadsafe
[#13761] cudalegacy NCVHaarObjectDetection hangs with RTX 2080 Ti
[#13491] Error when building with CUDA. VS 2017, Win10.
[#1958] (contrib) Feature request: Cuda CLAHE for 16 bit images
[#13477] cuda::createTemplateMatching not work with CUDA10.0
[#13014] cuda blockScanInclusive hangs with RTX 2080
[#12895] cudaoptflow: test failure of FarnebackOpticalFlow
[#12721] cudafilters: Median_Accuracy fails with CUDA 9.0 and after
[#12320] cv::cuda::integral hangs on Titan V
[#11622] CUDA Median filter tests fail with CUDA 9.1 but pass with CUDA 8.0
[#11606] Suggestion for the CUDA stream module
[#11511] unneeded cudaStreamSynchronize(stream_)
[#11298] bug in MemoryReturn in cuda module
[#11063] cudaarithm: async call to NPP fails
[#8938] Can --default-stream per-thread be used with opencv ?
[#8725] Calling cv::cuda::Stream::Null() results in a stray cudaMalloc() call
[#6742] cv::cuda::Filter thread safety
