WO2014008541A1 - Video processing method and system - Google Patents

Video processing method and system Download PDF

Info

Publication number
WO2014008541A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
salient
saliency
regions
source video
Prior art date
Application number
PCT/AU2013/000758
Other languages
English (en)
French (fr)
Inventor
Ivan HIMAWAN
Wei Song
Dian Wirawan TJONDRONEGORO
Original Assignee
Smart Services Crc Pty Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2012902926A external-priority patent/AU2012902926A0/en
Application filed by Smart Services Crc Pty Limited filed Critical Smart Services Crc Pty Limited
Publication of WO2014008541A1 publication Critical patent/WO2014008541A1/en

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167 Position within a video image, e.g. region of interest [ROI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/115 Selection of the code volume for a coding unit prior to coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object

Definitions

  • the present invention relates to a video processing method and system
  • 'Media streaming' is a multimedia distribution process where media content is presented to an end user as it is received.
  • the streamed content is typically
  • the term 'streaming media' typically represents media that is 'streamed' over a telecommunications network (as opposed to conventional broadcast media, such as radio and television, etc.).
  • Streaming allows a client system to utilise segments of a media file before the entire file has been received.
  • Streaming media files may be played directly from the streaming source (live streaming or true streaming) or stored temporarily on the client computing system during 'playback' (on demand streaming or progressive streaming).
  • Some streaming applications inherently dictate how the media stream is distributed (i.e. interactive media, such as video and audio conferencing streams, is usually distributed by live streaming to avoid delays).
  • Media streaming often requires a reliable, high bandwidth network connection between the media distributor (typically a server system) and the client system (such as a personal computer, mobile phone or tablet computer) .
  • the connection between the distributor and client may be routed through several networks (such as local area networks, cellular networks and other distribution networks).
  • Video streaming is typically the most bandwidth intensive form of media streaming. This is primarily due to the large size of video media files. Video streaming is used in various applications, including internet television, video-on-demand television, voice-over-IP and video conferencing. The bandwidth consumed by video streaming is usually related to the quality of the video being streamed.
  • Media content may be compressed to alleviate network loading during streaming. Compression reduces the size of the media files by reproducing the content using fewer bits. This can affect the quality of the content received by the client system.
  • the invention provides a video processing method comprising:
  • each saliency map identifying salient regions of the corresponding frame
  • the invention provides a video processing method comprising:
  • video processing method comprising:
  • each saliency map identifying salient regions of the corresponding frame
  • the invention provides a video processing method comprising identifying salient regions for individual frames of a source video and reproducing the source video with a lower bit rate by reducing bit allocation to non-salient frame regions.
  • the invention provides a video processing system comprising:
  • a mapper that generates a saliency map for individual frames of a source video,
  • each saliency map identifying salient regions of the corresponding frame, an encoder that reproduces the source video at a target bit rate by reducing the number of bits allocated to non-salient frame regions identified from the saliency maps, and
  • the invention provides a video processing system comprising:
  • a mapper that generates a plurality of distinct saliency maps at prescribed resolutions for individual frames of a source video, each saliency map identifying salient regions of the corresponding frame,
  • a compiler module that constructs a composite saliency map of each of the individual frames from the respective distinct saliency maps and a regional weighting that emphasises a centralised region of the frame, and an encoder that reproduces the source video at a lower bit rate by reducing the number of bits allocated to non-salient frame regions identified from the composite saliency maps.
  • the invention provides a video processing system comprising a mapper that identifies salient regions for individual frames of a source video and an encoder that reproduces the source video with a lower bit rate by reducing bit allocation to non-salient frame regions.
  • Figure 1 is a schematic representation of a media distribution system including a media server, a plurality of clients and a data network.
  • Figure 2 is a flow chart representation of a video processing method that involves reproducing salient regions of a video frame with a greater bit allocation than non-salient regions.
  • Figure 3 is a flow chart representation of a method for generating saliency maps for frames of a source video.
  • Figure 4a is a fine resolution saliency map of a frame from a source video depicting an aircraft flying over a desert.
  • the saliency map is presented at the same scale as the frame from the source video.
  • Figure 4b is an intermediate resolution saliency map generated from the same frame as the saliency map
  • Figure 4c is a coarse resolution saliency map generated from the same frame as the saliency map
  • Figure 4d is a regional weighting that prioritises a central frame region.
  • the regional weighting is presented at the same scale as the saliency maps presented in Figures 4a to 4c.
  • Figure 5a is a composite saliency map generated from the distinct saliency maps presented in Figure 4a to 4c combined with the regional weighting presented in Figure 4d.
  • Figure 5b is a reproduction of the source frame with the salient regions detected from the composite saliency map presented in Figure 5a outlined.
  • Figure 6a is a frame from a source video illustrating a news presenter seated toward the right of the frame.
  • the image illustrates the movement detection capabilities of a saliency procedure.
  • Figure 6b is another frame from the same source video as Figure 6a illustrating an incremental movement of the presenter's head and the corresponding regions of the frame identified as salient based on the presenter's movement.
  • Figure 6c is another frame from the same source video as Figures 6a and 6b illustrating movement of the presenter's head and right arm and the corresponding regions of the frame identified as salient based on the presenter's movement.
  • the illustrated distribution system 101 has a plurality of nodes 102, 104, 106 that are interconnected by a data network 108.
  • the data network 108 may comprise several sub-networks (such as fibre optic networks, cellular networks and local area networks) .
  • the illustrated distribution network comprises a media server 102 and a plurality of clients 104, 106.
  • the media server 102 and the clients 104, 106 each have a dedicated network interface 110 that facilitates
  • the respective network interfaces 110 may implement different protocols to comply with functional requirements of the sub-network they connect to (such as wireless, wired or fibre optic transmission requirements) and exhibit different characteristics (such as bandwidth limitations and speed).
  • the illustrated media server 102 is associated with hardware resources.
  • the hardware resources include a processor 118, volatile memory 116 (such as random access memory or RAM) , non-volatile memory 112, a network interface 110 and a bus controller 114.
  • the server hardware enables the media server 102 to function as a media host.
  • the server 102 may be implemented in any suitable computing architecture, such as a cloud based system, a virtual machine operating on shared hardware, a dedicated hardware machine or a server bank of connected hardware and/or virtual machines.
  • the media server 102 is the source of media files in the illustrated distribution network 101.
  • the media files are stored in non-volatile computer memory 112 within the media server 102.
  • the server 102 may be dedicated to a particular media type (such as a dedicated video server) or store a variety of media files (including video, audio, image, text and interactive media) .
  • the illustrated media server 102 makes the media files available to clients 104, 106 through the data network 108 (including any subnetworks) .
  • the media server 102 may process the media files before they are transmitted over the data network 108 to clients 104, 106.
  • the illustrated media server 102 incorporates a processor 118 and volatile memory module 116 that facilitates media processing as well as general operation of the server.
  • a bus controller 114 regulates data transfer within the server 102 and through the network interface 110.
  • the media server 102 may distribute media files to a variety of different clients, such as personal computers 104 and mobile phones 106 (as illustrated in figure 1) .
  • the media server 102 is ideally coupled to the network 108 by high speed data connection 103 that can accommodate simultaneous requests from a plurality of clients without significant disruption to media distribution.
  • the server connection 103 between the media server 102 and the general data network 108 may become overloaded in exceptional circumstances.
  • the client connections 105, 107 typically impose a greater restriction on network traffic than the media server connection 103.
  • Localised client restrictions may arise from client specific sub-networks (such as a cellular network), any auxiliary networks the media data is routed through (such as a local area network), service provider data controls that the client is subject to, or other bandwidth limitations.
  • Cellular networks (represented by connection 107 and mobile phone 106 in figure 1) are particularly vulnerable to data restrictions.
  • the server 102 may compress media files before distributing them across the network 108 to reduce network traffic. This is particularly applicable to video, audio and image files.
  • a video processing method that reduces the transmission size of a source video is illustrated in the flow chart presented in figure 2.
  • the illustrated processing method 201 exploits characteristics of the human visual system to reduce perceived degradation of image quality in compressed video media.
  • 'Foveal vision', or more colloquially 'central vision'
  • Foveal vision allows humans to discern fine details of objects or images that are positioned in a centralised region of their gaze.
  • Foveal vision accounts for the central two to three degrees of a person's visual field. At a typical reading distance of 30cm, a person's foveal vision covers a width of approximately 2cm. The remainder of the human visual field is made up of peripheral vision, which is less discerning of detail.
  • a method for prioritised encoding of salient frame regions is presented in the flow chart 201 illustrated in Figure 2.
  • the illustrated video processing method has two primary steps. These steps are shown in the left hand column of the flow chart .
  • Salient frame regions from a source video are identified in the first primary step (step 210)
  • This operation is typically performed by a suitable computing system (such as the media server 102 illustrated in figure 1).
  • the computing system identifies salient regions from individual frames of the source video.
  • a plurality of consecutive frames from a source video are typically processed to establish regional saliency (although frame sampling may be used to reduce processing load in some applications) .
  • the location of salient regions from each frame is documented by the computing system during processing.
  • the salient regions may be referenced by a macroblock address or another frame addressing system.
  • the saliency information obtained in step 210 is used to reproduce the source video at a lower bit rate in step 220 of the flow chart 201.
  • the bit rate is reduced by reducing the bit allocation to non-salient frame regions. This is typically accomplished by encoding the source video using a custom compression algorithm.
  • the salient frame references obtained in step 210 are passed to the encoder during compression of the source video.
  • the encoder uses the salient frame references to distinguish salient frame regions from non-salient frame regions.
  • Salient frame regions may be reproduced with the same bit allocation as the source video or a reduced bit allocation that complies with the target bit rate.
  • the steps illustrated in the right hand column of figure 2 demonstrate a particular implementation of the primary steps (steps 210 and 220) illustrated in the left hand column.
  • the implementation steps (shown on the right of figure 2) are divided into two groups that relate to the respective primary steps (210 and 220) as illustrated.
  • Identification of salient frame regions begins with the generation of a plurality of distinct saliency maps in step 212.
  • the saliency maps are generated at prescribed resolutions.
  • Each saliency map identifies the salient regions from a corresponding frame of the source video.
  • a plurality of saliency maps are used for the same frame to improve detection of salient objects within the depicted scene. Combining multiple saliency maps
  • the saliency maps may be generated by the server 102 using a plurality of scaled frame reproductions from the source video.
  • the scaled frame reproductions are ideally produced at the resolutions prescribed for the saliency maps.
  • the server 102 can then construct a frequency representation of each of the scaled reproductions. This corresponds to step 310 of the flow chart 301 illustrated in Figure 3.
  • a method of generating a suitable frequency representation is presented later in this specification.
  • the distinct saliency maps are generated from the frequency representations of the scaled frame reproductions.
  • the server 102 generates the distinct saliency maps for each frame by extracting the phase information contained in each frequency representation. The extracted phase information is used to derive a phase spectrum for each frequency representation. An individual saliency map can then be constructed from the phase spectrum of each scaled frame reproduction. The resulting saliency maps are scaled to a complementary resolution.
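  • By way of illustration only (the following sketch is not part of the disclosure), the multi-resolution map generation described above can be approximated in Python with NumPy and OpenCV. The per-scale function below is a simplified, single-channel stand-in for the quaternion phase-spectrum procedure detailed later in this specification, and the prescribed resolutions follow the maps illustrated in figures 4a to 4c:

        import cv2          # OpenCV, assumed available
        import numpy as np

        def phase_spectrum_saliency(img, sigma=3):
            # Simplified single-channel stand-in: keep only the phase of the
            # Fourier transform, invert, square and smooth.
            spectrum = np.fft.fft2(img.astype(np.float64))
            recon = np.fft.ifft2(np.exp(1j * np.angle(spectrum)))
            return cv2.GaussianBlur(np.abs(recon) ** 2, (0, 0), sigmaX=sigma)

        def multi_scale_saliency_maps(frame_gray,
                                      scales=((320, 180), (160, 90), (80, 45))):
            # One distinct saliency map per prescribed resolution, each
            # rescaled back to the source frame size (cf. figures 4a to 4c).
            h, w = frame_gray.shape
            maps = []
            for sw, sh in scales:
                scaled = cv2.resize(frame_gray, (sw, sh), interpolation=cv2.INTER_AREA)
                sal = phase_spectrum_saliency(scaled)
                maps.append(cv2.resize(sal, (w, h), interpolation=cv2.INTER_LINEAR))
            return maps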
  • Three distinct saliency maps that have been generated for the same frame of a source video are illustrated in figures 4a to 4c.
  • the selected frame of the source video depicts an aircraft in the lower half of the scene, approximately midway between either side of the frame.
  • the aircraft 401 is depicted flying over a desert.
  • a sand dune 405 is captured in an upper left section of the frame in the aircraft's general flight path.
  • the saliency maps 410, 420, 430 have been generated at distinct resolutions.
  • the illustrated maps 410, 420, 430 are reproduced from the source video with 320 x 180 pixels, 160 x 90 pixels and 80 x 45 pixels resolution respectively.
  • Each map 410, 420, 430 has been rescaled to the frame size of the source video in figures 4a to 4c.
  • Generating distinct saliency maps at different resolutions simulates human visual perception at different distances from a scene.
  • the refined saliency map 410 illustrated in figure 4a emphasises localised salient features of the image, which include the fuselage 403, engines 402 and wings 404 of the aircraft 401 in the illustrated embodiment.
  • the coarse saliency map 430 emphasises globally salient features of the frame, such as a general outline of the aircraft 401, without clearly defined localised features.
  • The distinct saliency maps 410, 420, 430 are combined to form a composite saliency map. This
  • A composite saliency map 450 for the illustrated scene is presented in figure 5a.
  • the composite saliency map 450 illustrated in figure 5a has been constructed from the distinct saliency maps 410, 420, 430 illustrated in figures 4a to 4c with weightings of 0.6, 0.3 and 0.1 respectively.
  • the server 102 may linearly adjust the intensity values of each map prior to constructing the composite saliency map. This corresponds to step 334 of the flow chart 301 illustrated in Figure 3.
  • the server 102 may also compensate for outliers contained in the respective saliency maps during the linear adjustment procedure.
  • One adjustment method involves saturating the top 2% and bottom 2% of all pixel values before distributing the remaining pixel values linearly between 0 and 1.
  • the illustrated composite saliency map 450 also incorporates a regional weighting that prioritises a centralised region of the frame. This corresponds to step 216 of the flow chart 201 illustrated in Figure 2 and step 338 of the flow chart 301 illustrated in Figure 3.
  • a graphical representation of the centralised regional weighting used to construct composite saliency map 450 is presented in figure 4d.
  • the centralised regional weighting 440 may be applied to each of the distinct saliency maps 410, 420, 430 individually or in a single operation following their combination.
  • the effects of the regional weighting 440 are evident in figures 4d and 5a, where a centralised region of the frame 442 is emphasised despite the lack of saliency detection in the distinct saliency maps 410, 420, 430.
  • the regional weighting 440 is particularly suitable for sports and news coverage, where the camera operator typically captures the subject in the centre of the frame.
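  • A minimal sketch of the composite construction described above, assuming the example weights 0.6, 0.3 and 0.1, the 2% saturation adjustment described below, and a Gaussian centre prior standing in for the regional weighting of figure 4d (the exact weighting function is not prescribed by the specification):

        import numpy as np

        def normalise_with_saturation(sal, sat=0.02):
            # Linear adjustment: saturate the top 2% and bottom 2% of pixel
            # values, then spread the remainder linearly between 0 and 1.
            lo, hi = np.quantile(sal, [sat, 1.0 - sat])
            return np.clip((sal - lo) / (hi - lo + 1e-12), 0.0, 1.0)

        def centre_weighting(shape, sigma_frac=0.3):
            # Assumed Gaussian prior emphasising a centralised frame region.
            h, w = shape
            y, x = np.mgrid[0:h, 0:w]
            cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
            return np.exp(-(((x - cx) / (sigma_frac * w)) ** 2
                            + ((y - cy) / (sigma_frac * h)) ** 2))

        def composite_saliency(maps, weights=(0.6, 0.3, 0.1)):
            # Weighted combination of the distinct maps (fine, intermediate,
            # coarse) followed by the centralised regional weighting.
            maps = [normalise_with_saturation(m) for m in maps]
            combined = sum(w * m for w, m in zip(weights, maps))
            return normalise_with_saturation(combined * centre_weighting(maps[0].shape))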
  • the system processing the source video (such as media server 102 illustrated in figure 1) identifies the salient frame regions of the source video from the regionally weighted composite saliency maps created for each frame (such as the composite saliency map 450 illustrated in figure 5a).
  • One addressing system that the computing apparatus may use to record salient frame regions involves dividing each frame of the source video into a plurality of image blocks that represent regional frame divisions. Each image block can then be categorised as either salient or non-salient.
  • the saliency of each image block may be assessed and categorised by the server 102 using a threshold procedure.
  • the server may implement the threshold by assigning each image block a saliency value and subsequently comparing the assigned saliency values to a prescribed saliency threshold to evaluate the image block's saliency. This enables the saliency of each image block to be efficiently evaluated by the server 102 during reproduction of the source video.
  • the saliency values for each image block may be derived from a common saliency indicator.
  • the saliency indicator is the same for each frame of the source video that is reproduced using the threshold procedure.
  • the saliency indicator is preferably derived from pixel characteristics of the source video.
  • Potential saliency indicators include the intensity, colour or orientation of pixel groups within the image block, or changes in pixel group characteristics bounded by a corresponding regional frame division that are indicative of image movement.
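  • A sketch of the block-based threshold procedure, assuming 16 x 16 image blocks, a mean-saliency indicator and an arbitrary threshold of 0.5 (the indicator and threshold value are left as implementation choices by the specification):

        import numpy as np

        def classify_blocks(saliency_map, block=16, threshold=0.5):
            # Divide the frame into block x block image regions, assign each
            # a saliency value (the mean of the composite map over the block)
            # and compare it to the prescribed threshold.
            h, w = saliency_map.shape
            rows, cols = h // block, w // block
            salient = np.zeros((rows, cols), dtype=bool)
            for r in range(rows):
                for c in range(cols):
                    region = saliency_map[r * block:(r + 1) * block,
                                          c * block:(c + 1) * block]
                    salient[r, c] = region.mean() >= threshold
            return salient           # True = salient image block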
  • the results of one motion based saliency detection procedure are illustrated in Figures 6a to 6c.
  • the detection procedure is based on changes in pixel group characteristics between consecutive frames of a source video.
  • Three frames from the source video are presented in Figures 6a to 6c.
  • the frames depict a news presenter addressing the camera from a seated position toward the right of each frame.
  • the detection procedure is ideally implemented by the server 102 as a component of a
  • the motion detection procedure illustrated in Figures 6a to 6c initially concentrates on the presenter's head movement as she turns toward the camera (the transition sequence between Figures 6a and 6b).
  • the motion detection procedure then detects changes in the position of the presenter's head and right arm.
  • The detection procedure has divided the changes in Figure 6 into a plurality of sub-groupings (as illustrated by the individual detection boxes). These groupings reflect individual centres of motion (such as the presenter's hand, elbow, hair and facial features).
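  • A sketch of a frame-differencing detector in the spirit of figures 6a to 6c; the difference threshold, the dilation and the OpenCV contour grouping are assumptions rather than the disclosed procedure:

        import cv2
        import numpy as np

        def motion_detection_boxes(prev_gray, curr_gray, diff_thresh=25, min_area=64):
            # Changes in pixel characteristics between consecutive frames are
            # thresholded and grouped into individual detection boxes, i.e.
            # separate centres of motion.
            diff = cv2.absdiff(curr_gray, prev_gray)
            _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
            mask = cv2.dilate(mask, np.ones((3, 3), np.uint8), iterations=2)
            # OpenCV 4.x return signature (contours, hierarchy)
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
            return [cv2.boundingRect(c) for c in contours
                    if cv2.contourArea(c) >= min_area]   # (x, y, w, h) boxes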
  • the source video may be reproduced at a lower bit rate (as indicated by step 220 of the flow chart 201) after the salient frame regions have been identified and documented (step 210 of the flow chart 201) .
  • a target bit rate is set for the source video. The target bit rate is then used to control the encoding process.
  • the source video is reproduced at the target bit rate by reducing the number of bits allocated to non-salient frame regions.
  • the computing apparatus identifies the salient regions of a frame from a corresponding saliency map, as indicated in step 222 of the flow chart 201.
  • reproduction of the source video is based on composite saliency maps (such as saliency map 450 illustrated in figure 5a).
  • the computing apparatus preferably encodes the video by dividing each frame into salient and non-salient regions, based on the image block addresses of salient regions documented in step 210 of the flow chart 201.
  • the image block addresses are ideally defined as macroblocks. This allows the computing apparatus to implement an encoder that complies with the Advanced Video Coding (AVC) standard.
  • AVC compliant embodiments of the encoding system ideally modulate the quantisation parameter (qp) used to encode each macroblock. This allows the encoder to dynamically alter the bit allocation to non-salient (and where applicable salient) frame regions.
  • each image block is allocated a number of bits.
  • the number of bits allocated to a particular image block may be influenced by the dynamics of the frame as well as the saliency of the image block .
  • the encoder reduces the bit allocation to non-salient frame regions to reduce the bit rate of the video
  • the encoder may also reduce the bit allocation to salient frame regions to comply with the target bit rate. However, the encoder allocates a greater number of bits to regions of each frame that have been identified as salient.
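  • A sketch of per-macroblock quantisation parameter modulation driven by the block classification; the base QP and the offsets are illustrative assumptions, and a real AVC encoder would supply the rate-control feedback needed to hit the target bit rate:

        import numpy as np

        def macroblock_qp_map(salient_blocks, base_qp=26,
                              salient_offset=-3, non_salient_offset=6):
            # Salient macroblocks receive a lower quantisation parameter
            # (more bits); non-salient macroblocks a higher one (fewer bits).
            qp = np.where(salient_blocks,
                          base_qp + salient_offset,
                          base_qp + non_salient_offset)
            return np.clip(qp, 0, 51)      # AVC quantisation parameters range 0..51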
  • the server 102 ideally implements a saliency smoothing procedure.
  • Saliency smoothing reduces abrupt transitions between salient and non-salient frame regions to reduce viewer distraction.
  • the server 102 may implement spatial saliency smoothing and/or temporal smoothing. Spatial saliency smoothing (step 224 of flow chart 201) reduces abrupt saliency transitions within individual frames.
  • In-frame saliency smoothing may be achieved by assigning an intermediate bit allocation to pixels that separate salient and non-salient regions of the frame.
  • the server 102 defines a transition buffer between salient and non-salient frame regions of the reproduced video. Pixels in the transition buffer are assigned an intermediate bit allocation.
  • the intermediate bit allocation may vary within the transition buffer, but it is typically bounded by the bit allocation to adjacent salient regions (the upper bound) and non-salient regions (the lower bound) .
  • the intermediate bit allocation preferably decreases with displacement from salient frame regions. This enables the transition buffer to produce a gradual transition from salient to non-salient regions within the frame .
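  • A sketch of the transition buffer using a distance transform so that the intermediate allocation decreases with displacement from salient regions; the buffer width of four blocks is an assumption:

        import numpy as np
        from scipy.ndimage import distance_transform_edt   # SciPy, assumed available

        def spatial_smoothing(salient_blocks, buffer_width=4.0):
            # Per-block weight in [0, 1]: 1 inside salient regions, 0 in
            # distant non-salient regions, with a gradual fall-off (the
            # transition buffer) bounded by the salient (upper) and
            # non-salient (lower) allocations.
            dist = distance_transform_edt(~salient_blocks)   # distance to nearest salient block
            return np.clip(1.0 - dist / buffer_width, 0.0, 1.0)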
  • Temporal saliency smoothing (step 226 of flow chart 201) reduces abrupt saliency transitions between consecutive frames.
  • Temporal smoothing is ideally applied when a frame region of the reproduced video transitions from salient to non-salient in successive frames.
  • Transitions from non-salient to salient may not require smoothing as the non-salient regions are generally outside the viewer's central gaze in frames preceding the transition (and therefore less likely to cause distraction).
  • Temporal smoothing is typically achieved by assigning residual bits to regions of a frame that have recently transitioned from salient to non-salient.
  • the residual bits allocated to recently transitioned frame regions reduce visual distractions by graduating the associated reduction in bit allocation.
  • the recently transitioned frame regions typically maintain greater image detail than other non-salient frame regions for several frames following transition because of the residual bit allocation.
  • the residual bit allocation assigned to recently transitioned non-salient regions can be gradually reduced over several frames to accommodate the corresponding reduction in image detail.
  • the number of bits assigned to the transitioning region reduces from an upper bound (typically corresponding to the pre-transition bit allocation) to a lower bound that is generally consistent with the bit allocation assigned to other non-salient frame regions.
  • Temporal smoothing can be implemented by a saliency regulator during encoding by averaging the allocation of bits to individual frame regions .
  • the saliency regulator ideally divides each frame into discrete regions that are referenced by frame divisions (such as image block addresses) .
  • the saliency regulator calculates a dedicated moving average for each regional frame division of the source video during reproduction.
  • the residual bit allocation for transitioning frame divisions is determined from the dedicated moving average .
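  • A sketch of the saliency regulator's per-region averaging; an exponential moving average stands in for the dedicated moving average, and the smoothing factor is an assumption:

        import numpy as np

        class SaliencyRegulator:
            # Keeps a running average per regional frame division so that
            # regions transitioning from salient to non-salient shed bits
            # gradually (the residual bit allocation), while newly salient
            # regions are boosted immediately.
            def __init__(self, grid_shape, alpha=0.25):
                self.avg = np.zeros(grid_shape, dtype=np.float64)
                self.alpha = alpha                  # assumed smoothing factor

            def update(self, current_weight):
                # current_weight: this frame's per-block weight, e.g. the
                # output of spatial_smoothing() above.
                self.avg = (1.0 - self.alpha) * self.avg + self.alpha * current_weight
                return np.maximum(self.avg, current_weight)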
  • a saliency map for individual frames of a source video may be generated by reconstructing the phase spectrum of an image's Fourier transform.
  • the disclosed reconstruction requires minimal computational effort, making it suitable for real-time implementation.
  • a quaternion Fourier transform is ideally used to construct the frequency representation of each frame.
  • the quaternion Fourier transform is capable of accommodating a plurality of salient image features (such as colour, intensity and motion) in a holistic manner.
  • a video with a total number of T frames is processed frame by frame as an image I(x, y, t).
  • the RGB color frame is decomposed into a luminance component (Y) and two chrominance components (Cr and Cb) .
  • the motion feature is calculated as M(t) = |Y(t) - Y(t - τ)|, where Y is the luminance component and τ is the latency between frames.
  • the motion component M(t) captures the temporal saliency between frames with the latency τ.
  • the four features can be represented by a quaternion image q(t) .
  • the Quaternion Fourier Transform (QFT) of the quaternion image q(t) is then computed. This corresponds to step 314 of the flow chart 301 illustrated in Figure 3.
  • the frequency representation generated from the quaternion image q(t) is Q(t) = F[q(t)], where F denotes the Quaternion Fourier Transform.
  • Q(t) is the frequency domain representation of q(t).
  • Q (the index t is dropped for clarity's sake) can be represented in polar form as Q = ||Q|| exp(μΦ),
  • where Φ is the phase spectrum of Q and μ is a unit pure quaternion.
  • the spatio-temporal saliency map forms the basis of the composite saliency map 450 illustrated in Figure 5a.
  • Various heuristics may be applied to the spatio-temporal saliency map to refine the saliency detection that is achieved. This corresponds to step 330 of the flow chart 301 illustrated in Figure 3.
  • Some possible heuristics that may be applied to the spatio-temporal saliency map (specifically the heuristics documented in steps 332, 334, 336 and 338 of the flow chart 301 illustrated in Figure 3) have been described previously in this specification.
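  • A sketch of the spatio-temporal saliency computation described above; channel-wise 2-D FFTs approximate the quaternion Fourier transform of the specification (a true QFT, e.g. via symplectic decomposition, would follow the same structure), and the Gaussian post-filter width is an assumption:

        import cv2
        import numpy as np

        def spatio_temporal_saliency(frame_bgr, prev_y=None, sigma=3):
            # Luminance (Y), chrominance (Cr, Cb) and the motion feature
            # M(t) = |Y(t) - Y(t - tau)| are each passed through a 2-D FFT,
            # only the phase is kept, and the phase-only reconstructions are
            # summed and blurred to form the spatio-temporal saliency map.
            ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb).astype(np.float64)
            y, cr, cb = ycrcb[..., 0], ycrcb[..., 1], ycrcb[..., 2]
            motion = np.abs(y - prev_y) if prev_y is not None else np.zeros_like(y)
            saliency = np.zeros_like(y)
            for channel in (y, cr, cb, motion):
                spectrum = np.fft.fft2(channel)
                recon = np.fft.ifft2(np.exp(1j * np.angle(spectrum)))  # phase only
                saliency += np.abs(recon) ** 2
            saliency = cv2.GaussianBlur(saliency, (0, 0), sigmaX=sigma)
            return saliency, y    # y is kept as the reference for the next frame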
PCT/AU2013/000758 2012-07-09 2013-07-09 Video processing method and system WO2014008541A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2012902926 2012-07-09
AU2012902926A AU2012902926A0 (en) 2012-07-09 Method and apparatus for video processing

Publications (1)

Publication Number Publication Date
WO2014008541A1 true WO2014008541A1 (en) 2014-01-16

Family

ID=49915248

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2013/000758 WO2014008541A1 (en) 2012-07-09 2013-07-09 Video processing method and system

Country Status (1)

Country Link
WO (1) WO2014008541A1 (it)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002080568A2 (en) * 2001-03-29 2002-10-10 Electronics For Imaging, Inc. Digital image compression
US20060215766A1 (en) * 2005-03-01 2006-09-28 Haohong Wang Region-of-interest coding in video telephony using RHO domain bit allocation

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021118149A1 (en) * 2019-12-09 2021-06-17 Samsung Electronics Co., Ltd. Electronic apparatus and control method thereof
US11394981B2 (en) 2019-12-09 2022-07-19 Samsung Electronics Co., Ltd. Electronic apparatus and control method thereof
WO2022111140A1 (en) * 2020-11-25 2022-06-02 International Business Machines Corporation Video encoding through non-saliency compression for live streaming of high definition videos in low-bandwidth transmission
US11758182B2 (en) 2020-11-25 2023-09-12 International Business Machines Corporation Video encoding through non-saliency compression for live streaming of high definition videos in low-bandwidth transmission
GB2616998A (en) * 2020-11-25 2023-09-27 Ibm Video encoding through non-saliency compression for live streaming of high definition videos in low-bandwidth transmission

Similar Documents

Publication Publication Date Title
US11290699B2 (en) View direction based multilevel low bandwidth techniques to support individual user experiences of omnidirectional video
JP2021103327A (ja) Apparatus and method for providing and displaying content
US8872895B2 (en) Real-time video coding using graphics rendering contexts
KR20190126840A (ko) Compression methods and systems for near-eye displays
US20140321561A1 (en) System and method for depth based adaptive streaming of video information
US11153615B2 (en) Method and apparatus for streaming panoramic video
CN109891850A (zh) Method and apparatus for reducing 360-degree viewport adaptive streaming media latency
Yuan et al. Spatial and temporal consistency-aware dynamic adaptive streaming for 360-degree videos
US6597736B1 (en) Throughput enhanced video communication
US20130222377A1 (en) Generation of depth indication maps
CN111295884A (zh) Image processing apparatus and image processing method
JP2013539610A (ja) Generation of high dynamic range images from low dynamic range images
WO2017066346A1 (en) Method and apparatus for optimizing video streaming for virtual reality
US10769754B2 (en) Virtual reality cinema-immersive movie watching for headmounted displays
US11076162B2 (en) Method and network equipment for encoding an immersive video spatially tiled with a set of tiles
Lee et al. Efficient video coding based on audio-visual focus of attention
EP3497935A1 (en) Adaptive video consumption
WO2013108230A1 (en) Distinct encoding and decoding of stable information and transient/stochastic information
US11330309B2 (en) Foviation and HDR
JP5941000B2 (ja) Video distribution apparatus and video distribution method
Yu et al. Convolutional neural network for intermediate view enhancement in multiview streaming
KR101941789B1 (ko) Virtual reality video transmission based on viewport and tile size
WO2014008541A1 (en) Video processing method and system
Oztas et al. A rate adaptation approach for streaming multiview plus depth content
KR102251576B1 (ko) Region-of-interest-based VR video receiving apparatus and method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13816022

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13816022

Country of ref document: EP

Kind code of ref document: A1